Phytases (myo-inositol hexakisphosphate phosphohydrolases; EC 3.1.3.8) are enzymes that hydrolyze
phytate (myo-inositol hexakisphosphate) to myo-inositol and inorganic phosphate and are known to be
valuable feed additives.

A phytase was first described in rice bran in 1907 [Suzuki et al., Bull. Coll. Agr. Tokio Imp. Univ. 7, 495
(1907)] and phytases from Aspergillus species in 1911 [Dox and Golden, J. Biol. Chem. 10, 183-186 (1911)].
Phytases have also been found in wheat bran, plant seeds, animal intestines and in microorganisms
[Howson and Davis, Enzyme Microb. Technol. 5, 377-382 (1983); Lambrechts et al., Biotech. Lett. 14, 61-66
(1992); Shieh and Ware, Appl. Microbiol. 16, 1348-1351 (1968)].

The cloning and expression of the phytase from Aspergillus niger (ficuum) has been described by
van Hartingsveldt et al. in Gene 127, 87-94 (1993) and in European Patent Application, Publication No. 420
358, and from Aspergillus niger var. awamori by Piddington et al. in Gene 133, 55-62 (1993).

Since phytases used so far in agriculture have certain disadvantages, it is an object of the present
invention to provide new phytases or, more generally speaking, polypeptides with activity against
inositol phosphates including phytate ("phytase activity") in large quantities and with improved properties.
Since it is known that phytases used so far lose activity during the feed pelleting process due to heat
treatment, improved heat tolerance would be such a property.

So far phytases have not been reported in thermotolerant fungi with the exception of Aspergillus
fumigatus [Dox and Golden, J. Biol. Chem. 10, 183-186 (1911)] and Rhizopus oryzae [Howson and
Davis, Enzyme Microb. Technol. 5, 377-382 (1983)]. Thermotolerant phytases have been described
originating from Aspergillus terreus Strain 9A-1 [temperature optimum 70 °C; Yamada et al., Agr. Biol.
Chem. 32, 1275-1282 (1968)] and Schwanniomyces castellii [temperature optimum 77 °C; Segueilha et
al., Bioeng. 74, 7-11 (1992)]. However, for commercial use in agriculture such phytases must be available in
large quantities. Accordingly it is an object of the present invention to provide DNA sequences coding for
heat tolerant phytases. Improved heat tolerance of phytases encoded by such DNA sequences can be
determined by assays known in the art, e.g. by the processes used for feed pelleting or assays determining
the heat dependence of the enzymatic activity itself as described, e.g. by Yamada et al. (s.a.).

It is furthermore an object of the present invention to screen fungi which show a certain degree of
thermotolerance for phytase production. Such screening can be made as described, e.g. in Example 1. In
this way heat tolerant fungal strains, listed in Example 1, have been identified for the first time to produce a
phytase.

Heat tolerant fungal strains, see e.g. those listed in Example 1, can then be grown as known in the art,
e.g. as indicated by their supplier, e.g. the American Type Culture Collection (ATCC), Deutsche
Sammlung von Mikroorganismen und Zellkulturen GmbH (DSM), Agricultural Research Service Culture
Collection (NRRL) and the Centraalbureau voor Schimmelcultures (CBS) from which such strains are
available, or as indicated, e.g. in Example 2.

Further improved properties are, e.g. an improved substrate specificity regarding phytic acid [myo-inositol
(1,2,3,4,5,6) hexakisphosphate], which is a major storage form of phosphorus in plants and seeds.
For the complete release of the six phosphate groups from phytic acid an enzyme is required with sufficient
activity against phytic acid and all other inositol phosphate molecules. Using e.g. Aspergillus niger phytase
requires for this complete release the addition of the pH 2.5 acid phosphatase. Having only one enzyme
with the required activity would be of clear advantage. For example, International Patent Application
Publication No. 94/03072 discloses an expression system which allows the expression of a mixture of
phytate degrading enzymes in desired ratios. However, it would be even more desirable to have both such
activities in a single polypeptide. Therefore it is also an object of the present invention to provide DNA
sequences coding for such polypeptides. Phytase and phosphatase activities can be determined by assays
known in the state of the art or described, e.g. in Example 9.

Another improved property is, e.g. a so-called improved pH profile. This means, e.g. two phytin
degrading activity maxima, e.g. one at around pH 2.5, which could be the pH in the stomach of certain
animals, and another at around pH 5.5, which could be the pH after the stomach in certain animals. Such a pH
profile can be determined by assays known in the state of the art or described, e.g. in Example 9.
Accordingly it is also an object of the present invention to provide DNA sequences coding for such
improved polypeptides.

In general it is an object of the present invention to provide a DNA sequence coding for a polypeptide
having phytase activity and which DNA sequence is derived from a fungus selected from the group
consisting of Acrophialophora levis, Aspergillus terreus, Aspergillus fumigatus, Aspergillus nidulans,
Aspergillus sojae, Calcarisporiella thermophila, Chaetomium rectopilium, Corynascus thermophilus, Humicola sp.,
Mycelia sterilia, Myriococcum thermophilum, Myceliophthora thermophila, Rhizomucor miehei, Sporotrichum
cellulophilum, Sporotrichum thermophile, Scytalidium indonesicum and Talaromyces thermophilus, or a DNA
sequence coding for a fragment of such a polypeptide which fragment still has phytase activity, or more
specifically such a DNA sequence wherein the fungus is selected from the group consisting of
Acrophialophora levis, Aspergillus fumigatus, Aspergillus nidulans, Aspergillus terreus, Calcarisporiella
thermophila, Chaetomium rectopilium, Corynascus thermophilus, Sporotrichum cellulophilum, Sporotrichum
thermophile, Mycelia sterilia, Myceliophthora thermophila and Talaromyces thermophilus, or more
specifically such a DNA sequence wherein the fungus is selected from the group consisting of Aspergillus terreus,
Myceliophthora thermophila, Aspergillus fumigatus, Aspergillus nidulans and Talaromyces thermophilus.
DNA sequences coding for a fragment of a polypeptide of the present invention can, e.g. be between 1350
and 900, preferably between 900 and 450 and most preferably between 450 and 150 nucleotides long and
can be prepared on the basis of the DNA sequence of the complete polypeptide by recombinant methods
or by chemical synthesis with which a man skilled in the art is familiar.

Furthermore it is an object of the present invention to provide a DNA sequence which codes for a 
polypeptide having phytase activity and which DNA sequence is selected from the following: 

(a) the DNA sequence of Figure 1 [SEQ ID NO:1] or its complementary strand;

(b) a DNA sequence which hybridizes under standard conditions with sequences defined under (a) or 
preferably with the coding region of such sequences or more preferably with a region between positions 
491 to 1856 of such DNA sequences or even more preferably with a genomic probe obtained by 
preferably random priming using DNA of Aspergillus terreus 9A1 as described in Example 12. 

(c) a DNA sequence which, because of the degeneracy of the genetic code, does not hybridize with 
sequences of (a) or (b), but which codes for polypeptides having exactly the same amino acid 
sequences as the polypeptides encoded by these DNA sequences; and 

(d) a DNA sequence which is a fragment of the DNA sequences specified in (a), (b) or (c).

"Standard conditions" for hybridization mean in this context the conditions which are generally used by
a man skilled in the art to detect specific hybridization signals and which are described, e.g. by Sambrook 
et al., "Molecular Cloning" second edition. Cold Spring Harbor Laboratory Press 1989, New York, or 
preferably so called stringent hybridization and non-stringent washing conditions or more preferably so 
called stringent hybridization and stringent washing conditions a man skilled in the art is familiar with and 
which are described, e.g. in Sambrook et al. (s.a.) or even more preferred the stringent hybridization and 
non-stringent or stringent washing conditions as given in Example 12. "Fragment of the DNA sequences" 
means in this context a fragment which codes for a polypeptide still having phytase activity as specified 
above.

It is also an object of the present invention to provide a DNA sequence which codes for a polypeptide 
having phytase activity and which DNA sequence is selected from the following: 

(a) the DNA sequence of Figure 2 [SEQ ID NO:3] or its complementary strand;

(b) a DNA sequence which hybridizes under standard conditions with sequences defined under (a) or 
preferably a region which extends to about at least 80 % of the coding region optionally comprising 
about between 100 to 150 nucleotides of the 5'end of the non-coding region of such DNA sequences or 
more preferably with a region between positions 2068 to 3478 of such DNA sequences or even more 
preferably with a genomic probe obtained by preferably random priming using DNA of Myceliophthora 
thermophila as described in Example 12. 

(c) a DNA sequence which, because of the degeneracy of the genetic code, does not hybridize with 
sequences of (a) or (b), but which codes for polypeptides having exactly the same amino acid 
sequences as the polypeptides encoded by these DNA sequences; and

(d) a DNA sequence which is a fragment of the DNA sequences specified in (a), (b) or (c). 
"Fragments" and "standard conditions" have the meaning as given above. 

It is also an object of the present invention to provide a DNA sequence which codes for a polypeptide 
having phytase activity and which DNA sequence is selected from the following: 

(a) a DNA sequence comprising one of the DNA sequences of Figures 4 [SEQ ID NO:5], 5 [SEQ ID
NO:7], 6 [SEQ ID NO:9] or 10 ["aterr21", SEQ ID NO:13; "aterr58", SEQ ID NO:14] or its complementary
strand;

(b) a DNA sequence which hybridizes under standard conditions with sequences defined under (a) or
preferably with such sequences comprising the DNA sequence of Figure 4 [SEQ ID NO:5] isolatable
from Talaromyces thermophilus, or of Figure 5 [SEQ ID NO:7] isolatable from Aspergillus fumigatus, or
of Figure 6 [SEQ ID NO:9] isolatable from Aspergillus nidulans, or of one or both of the sequences given
in Figure 10 ["aterr21", SEQ ID NO:13; "aterr58", SEQ ID NO:14] isolatable from Aspergillus terreus
(CBS 220.95), or more preferably with a region of such DNA sequences spanning at least 80 % of the
coding region, or most preferably with a genomic probe obtained by random priming using DNA of
Talaromyces thermophilus or Aspergillus fumigatus or Aspergillus nidulans or Aspergillus terreus (CBS
220.95) as described in Example 12;

(c) a DNA sequence which, because of the degeneracy of the genetic code, does not hybridize with
sequences of (a) or (b) but which codes for polypeptides having exactly the same amino acid sequences
as the polypeptides encoded by these DNA sequences; and

(d) a DNA sequence which is a fragment of the DNA sequences specified in (a), (b) or (c).

It is furthermore an object of the present invention to provide a DNA sequence which codes for a
polypeptide having phytase activity and which DNA sequence is selected from a DNA sequence comprising
the DNA sequence of Figure 4 [SEQ ID NO:5] isolatable from Talaromyces thermophilus, of Figure 5 [SEQ
ID NO:7] isolatable from Aspergillus fumigatus, of Figure 6 [SEQ ID NO:9] isolatable from Aspergillus
nidulans or of Figure 10 ["aterr21": SEQ ID NO:13; "aterr58": SEQ ID NO:14] isolatable from Aspergillus
terreus (CBS 220.95) or which DNA sequence is a degenerate variant or equivalent thereof.

"Fragments" and "standard conditions" have the meaning as given above. "Degenerate variant" means
in this context a DNA sequence which, because of the degeneracy of the genetic code, has a different
nucleotide sequence from the one referred to but codes for a polypeptide with the same amino acid
sequence. "Equivalent" refers in this context to a DNA sequence which codes for polypeptides having
phytase activity with an amino acid sequence which differs by deletion, substitution and/or addition of one
or more amino acids, preferably up to 50, more preferably up to 20, even more preferably up to 10 or most
preferably 5, 4, 3 or 2, from the amino acid sequence of the polypeptide encoded by the DNA sequence to
which the equivalent sequence refers. Amino acid substitutions which do not generally alter the specific
activity are known in the state of the art and are described, for example, by H. Neurath and R. L. Hill in "The
Proteins" (Academic Press, New York, 1979, see especially Figure 6, page 14). The most commonly
occurring exchanges are: Ala/Ser, Val/Ile, Asp/Glu, Thr/Ser, Ala/Gly, Ala/Thr, Ser/Asn, Ala/Val, Ser/Gly,
Tyr/Phe, Ala/Pro, Lys/Arg, Asp/Asn, Leu/Ile, Leu/Val, Ala/Glu, Asp/Gly as well as these in reverse (the three
letter abbreviations are used for amino acids and are standard and known in the art).
Such equivalents can be produced by methods known in the state of the art and described, e.g. in
Sambrook et al. (s.a.). Whether polypeptides encoded by such equivalent sequences still have a phytase
activity can be determined by one of the assays known in the art or, e.g. described in Example 9.
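
The exchange list above can be treated as a set of unordered residue pairs. The following is a minimal sketch in Python, given for illustration only; the names COMMON_EXCHANGES and is_common_exchange are hypothetical and not part of the specification, and membership in the list says nothing about whether a particular equivalent retains phytase activity, which has to be established by assay.

```python
# Commonly occurring amino acid exchanges as listed in the text
# (Neurath and Hill). Pairs are unordered, since the text states
# that the exchanges also apply in reverse.
_PAIRS = (
    "Ala/Ser Val/Ile Asp/Glu Thr/Ser Ala/Gly Ala/Thr Ser/Asn Ala/Val "
    "Ser/Gly Tyr/Phe Ala/Pro Lys/Arg Asp/Asn Leu/Ile Leu/Val Ala/Glu Asp/Gly"
)
COMMON_EXCHANGES = {frozenset(pair.split("/")) for pair in _PAIRS.split()}


def is_common_exchange(residue_a, residue_b):
    """Return True if the two three-letter residues form a listed exchange."""
    return frozenset((residue_a, residue_b)) in COMMON_EXCHANGES


if __name__ == "__main__":
    print(is_common_exchange("Ser", "Ala"))  # listed, in reverse order
    print(is_common_exchange("Trp", "Gly"))  # not in the list
```

Storing each pair as a frozenset makes the lookup direction-independent without listing every pair twice.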

It is also an object of the present invention to provide one of the aforementioned DNA sequences which
code for a polypeptide having phytase activity which DNA sequence is derived from a fungus, or more
specifically such a fungus selected from one of the above mentioned specific groups of fungi.

Furthermore it is an object of the present invention to provide a DNA sequence which codes for a
polypeptide having phytase activity and which DNA sequence hybridizes under standard conditions with a
probe which is a product of a PCR reaction with DNA isolated from a fungus of one of the above mentioned
groups of fungi and the following pair of PCR primers:

"ATGGA(C/T)ATGTG(C/T)TC(N)TT(C/T)GA" [SEQ ID NO:15] as sense primer and
"TT(A/G)CC(A/G)GC(A/G)CC(G/A)TG(N)CC(A/G)TA" [SEQ ID NO:16] as anti-sense primer.

"Standard conditions" have the meaning given above. "Product of a PCR reaction" means preferably a
product obtainable or more preferably as obtained by a reaction described in Example 12 referring back to
Example 11.

Furthermore it is an object of the present invention to provide a DNA sequence which codes for a
polypeptide having phytase activity and which DNA sequence hybridizes under standard conditions with a
probe which is a product of a PCR reaction with DNA isolated from Aspergillus terreus (CBS 220.95) and
the following two pairs of PCR primers:

(a) "ATGGA(C/T)ATGTG(C/T)TC(N)TT(C/T)GA" [SEQ ID NO:15] as the sense primer and
"TT(A/G)CC(A/G)GC(A/G)CC(G/A)TG(N)CC(A/G)TA" [SEQ ID NO:16] as the anti-sense primer; and

(b) "TA(C/T)GG(N)GA(C/T)TT(C/T)TC(N)CA(C/T)GA" [SEQ ID NO:17] as the sense primer and
"CG(G/A)TC(G/A)TT(N)AG(N)AG(N)AC(N)C" [SEQ ID NO:18] as the anti-sense primer.

"Standard conditions" are as defined above and the term "product of a PCR reaction" means
preferably a product obtainable or more preferably as obtained by a reaction described in Example 11.

It is furthermore an object of the present invention to provide a DNA sequence coding for a chimeric
construct having phytase activity which chimeric construct comprises a fragment of a DNA sequence as
specified above or preferably such a DNA sequence wherein the chimeric construct consists at its
N-terminal end of a fragment of the Aspergillus niger phytase fused at its C-terminal end to a fragment of the
Aspergillus terreus phytase, or more preferably such a DNA sequence with the specific nucleotide
sequence as shown in Figure 7 [SEQ ID NO:11] and a degenerate variant or equivalent thereof, wherein
"degenerate variant" and "equivalent" have the meanings as given above.

Furthermore it is an object of the present invention to provide a DNA sequence as specified above
wherein the encoded polypeptide is a phytase.

Genomic DNA or cDNA from fungal strains can be prepared as known in the art [see e.g. Yelton et al.,
Procd. Natl. Acad. Sci. USA, 1470-1474 (1984) or Sambrook et al., s.a.] or, e.g. as specifically described in
Example 2.

The cloning of the DNA-sequences of the present invention from such genomic DNA can then be
effected, e.g. by using the well known polymerase chain reaction (PCR) method. The principles of this
method are outlined e.g. by White et al. (1989), whereas improved methods are described e.g. in Innis et al.
[PCR Protocols: A Guide to Methods and Applications, Academic Press, Inc. (1990)]. PCR is an in vitro
method for producing large amounts of a specific DNA of defined length and sequence from a mixture of
different DNA-sequences. PCR is based on the enzymatic amplification of the specific DNA
fragment of interest which is flanked by two oligonucleotide primers which are specific for this sequence
and which hybridize to the opposite strands of the target sequence. The primers are oriented with their 3'
ends pointing toward each other. Repeated cycles of heat denaturation of the template, annealing of the
primers to their complementary sequences and extension of the annealed primers with a DNA polymerase
result in the amplification of the segment between the PCR primers. Since the extension product of each
primer can serve as a template for the other, each cycle essentially doubles the amount of the DNA
fragment produced in the previous cycle. By utilizing the thermostable Taq DNA polymerase, isolated from
the thermophilic bacterium Thermus aquaticus, it has been possible to avoid denaturation of the polymerase,
which necessitated the addition of enzyme after each heat denaturation step. This development has led to
the automation of PCR by a variety of simple temperature-cycling devices. In addition, the specificity of the
amplification reaction is increased by allowing the use of higher temperatures for primer annealing and
extension. The increased specificity improves the overall yield of amplified products by minimizing the
competition by non-target fragments for enzyme and primers. In this way the specific sequence of interest
is highly amplified and can be easily separated from the non-specific sequences by methods known in the
art, e.g. by separation on an agarose gel, and cloned by methods known in the art using vectors as
described e.g. by Holten and Graham in Nucleic Acid Res. 19, 1156 (1991), Kovalic et al. in Nucleic Acid
Res. 19, 4560 (1991), Marchuk et al. in Nucleic Acid Res. 19, 1154 (1991) or Mead et al. in Bio/Technology
9, 657-663 (1991).
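
The statement that each cycle essentially doubles the amount of the target fragment can be illustrated with a short calculation. The sketch below is an idealised model only; the function name copies_after and the efficiency parameter are assumptions introduced for illustration and do not appear in the specification. Real reactions fall below the ideal value as primers, nucleotides and enzyme become limiting.

```python
def copies_after(cycles, start_copies=1.0, efficiency=1.0):
    """Idealised PCR yield.

    Each cycle multiplies the amount of target by (1 + efficiency);
    efficiency = 1.0 corresponds to perfect doubling per cycle.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    return start_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # Perfect doubling over 35 cycles, starting from a single template copy.
    print(f"{copies_after(35):.3e}")
    # The same run at an assumed 90 % per-cycle efficiency.
    print(f"{copies_after(35, efficiency=0.9):.3e}")
```

Even a modest loss of per-cycle efficiency reduces the final yield considerably, which is one reason the specificity gained from higher annealing temperatures matters for the overall yield.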

The oligonucleotide primers used in the PCR procedure can be prepared as known in the art and
described e.g. in Sambrook et al. (1989, "Molecular Cloning", 2nd edt., Cold Spring Harbor Laboratory Press,
Cold Spring Harbor).

The specific primers used in the practice of the present invention have been designed as degenerate
primers on the basis of DNA-sequence comparisons of known sequences of the Aspergillus niger phytase,
the Aspergillus niger acid phosphatase, the Saccharomyces cerevisiae acid phosphatase and the
Schizosaccharomyces pombe acid phosphatase [for sequence information see, e.g. European
Bioinformatics Institute (Hinxton Hall, Cambridge, GB)]. The degeneracy of the primers was reduced by selecting
some codons according to a codon usage table of Aspergillus niger prepared on the basis of known
sequences from Aspergillus niger. Furthermore it has been found that the amino acid at the C-terminal end
of the amino acid sequences used to define the specific probes should be a conserved amino acid in all
acid phosphatases including phytases specified above but the rest of the amino acids should be more
phytase than phosphatase specific.

Such amplified DNA-sequences can then be used to screen DNA libraries of DNA of, e.g. fungal origin
by methods known in the art (Sambrook et al., s.a.) or as specifically described in Examples 5-7.

Once complete DNA-sequences of the present invention have been obtained they can be integrated
into vectors by methods known in the art and described e.g. in Sambrook et al. (s.a.) to overexpress the
encoded polypeptide in appropriate host systems. However, a man skilled in the art knows that the
DNA-sequences themselves can also be used to transform the suitable host systems of the invention to get
overexpression of the encoded polypeptide. Appropriate host systems are for example fungi, like Aspergilli,
e.g. Aspergillus niger [ATCC 9142] or Aspergillus ficuum [NRRL 3135], or like Trichoderma, e.g.
Trichoderma reesei, or yeasts, like Saccharomyces, e.g. Saccharomyces cerevisiae, or Pichia, like Pichia
pastoris, all available from ATCC. Bacteria which can be used are e.g. E. coli, Bacilli as, e.g. Bacillus
subtilis, or Streptomyces, e.g. Streptomyces lividans [see e.g. Anne and Mallaert in FEMS Microbiol. Letters
114, 121 (1993)]. E. coli which could be used are E. coli K12 strains, e.g. M15 [described as DZ 291 by
Villarejo et al. in J. Bacteriol. 120, 466-474 (1974)], HB 101 [ATCC No. 33694] or E. coli SG13009
[Gottesman et al., J. Bacteriol. 148, 265-273 (1981)].

Vectors which can be used for expression in fungi are known in the art and described e.g. in EP 420
358, or by Cullen et al. [Bio/Technology 5, 369-376 (1987)] or Ward in Molecular Industrial Mycology,
Systems and Applications for Filamentous Fungi, Marcel Dekker, New York (1991), Upshall et al.
[Bio/Technology 5, 1301-1304 (1987)], Gwynne et al. [Bio/Technology 5, 713-719 (1987)], Punt et al. [J. of
Biotechnology 17, 19-34 (1991)] and for yeast by Sreekrishna et al. [J. Basic Microbiol. 28, 265-278 (1988),
Biochem. 28, 4117-4125 (1989)], Hitzemann et al. [Nature 293, 717-722 (1981)] or in EP 183 070, EP 183
071, EP 248 227, EP 263 311. Suitable vectors which can be used for expression in E. coli are mentioned
e.g. by Sambrook et al. [s.a.] or by Fiers et al. in Procd. 8th Int. Biotechnology Symposium [Soc. Franc. de
Microbiol., Paris (Durand et al., eds.), pp. 680-697 (1988)] or by Bujard et al. in Methods in Enzymology,
eds. Wu and Grossmann, Academic Press, Inc., Vol. 155, 416-433 (1987) and Stüber et al. in Immunological
Methods, eds. Lefkovits and Pernis, Academic Press, Inc., Vol. IV, 121-152 (1990). Vectors which could be
used for expression in Bacilli are known in the art and described, e.g. in EP 405 370, Procd. Nat. Acad. Sci.
USA 81, 439 (1984) by Yansura and Henner, Meth. Enzym. 185, 199-228 (1990) or EP 207 459.

Either such vectors already carry regulatory elements, e.g. promoters, or the DNA-sequences of the
present invention can be engineered to contain such elements. Suitable promotor-elements which can be
used are known in the art and are, e.g. for Trichoderma reesei the cbh1- [Haarki et al., Biotechnology 7,
596-600 (1989)] or the pki1-promotor [Schindler et al., Gene 130, 271-275 (1993)], for Aspergillus oryzae
the amy-promotor [Christensen et al., Abstr. 19th Lunteren Lectures on Molecular Genetics F23 (1987),
Christensen et al., Biotechnology 6, 1419-1422 (1988), Tada et al., Mol. Gen. Genet. 229, 301 (1991)], for
Aspergillus niger the glaA- [Cullen et al., Bio/Technology 5, 369-376 (1987), Gwynne et al., Bio/Technology 5,
713-719 (1987), Ward in Molecular Industrial Mycology, Systems and Applications for Filamentous Fungi,
Marcel Dekker, New York, 83-106 (1991)], alcA- [Gwynne et al., Bio/Technology 5, 713-719 (1987)], suc1-
[Boddy et al., Current Genetics 24, 60-66 (1993)], aphA- [MacRae et al., Gene 71, 339-348 (1988), MacRae
et al., Gene 132, 193-198 (1993)], tpiA- [McKnight et al., Cell 46, 143-147 (1986), Upshall et al.,
Bio/Technology 5, 1301-1304 (1987)], gpdA- [Punt et al., Gene 69, 49-57 (1988), Punt et al., J. of
Biotechnology 17, 19-37 (1991)] and the pkiA-promotor [de Graaff et al., Curr. Genet. 22, 21-27 (1992)].
Suitable promotor-elements which could be used for expression in yeast are known in the art and are, e.g.
the pho5-promotor [Vogel et al., Molecular and Cellular Biology, 2050-2057 (1989); Rudolf and Hinnen,
Proc. Natl. Acad. Sci. 84, 1340-1344 (1987)] or the gap-promotor for expression in Saccharomyces
cerevisiae and for Pichia pastoris, e.g. the aox1-promotor [Koutz et al., Yeast 5, 167-177 (1989); Sreekrishna
et al., J. Basic Microbiol. 28, 265-278 (1988)].

Accordingly vectors comprising DNA sequences of the present invention, preferably for the expression
of said DNA sequences in bacteria or a fungal or a yeast host, and such transformed bacteria or fungal or
yeast hosts are also an object of the present invention.

Once such DNA-sequences have been expressed in an appropriate host cell in a suitable medium the
encoded phytase can be isolated either from the medium in the case the phytase is secreted into the
medium or from the host organism in case such phytase is present intracellularly by methods known in the
art of protein purification or described, e.g. in EP 420 358. Accordingly a process for the preparation of a
polypeptide of the present invention characterized in that transformed bacteria or a host cell as described
above is cultured under suitable culture conditions and the polypeptide is recovered therefrom and a
polypeptide when produced by such a process or a polypeptide encoded by a DNA sequence of the
present invention are also an object of the present invention.

Once obtained the polypeptides of the present invention can be characterized regarding their activity by
assays known in the state of the art or as described, e.g. by Engelen et al. [J. AOAC Intern. 77, 760-764
(1994)] or in Example 9. Regarding their properties which make the polypeptides of the present invention
useful in agriculture any assay known in the art and described e.g. by Simons et al. [British Journal of
Nutrition 64, 525-540 (1990)], Schöner et al. [J. Anim. Physiol. a. Anim. Nutr. 66, 248-255 (1991)], Vogt
[Arch. Geflügelk. 56, 93-98 (1992)], Jongbloed et al. [J. Anim. Sci. 70, 1159-1168 (1992)], Perney et al.
[Poultry Science 72, 2106-2114 (1993)], Farrell et al. [J. Anim. Physiol. a. Anim. Nutr. 69, 278-283 (1993)],
Broz et al. [British Poultry Science 35, 273-280 (1994)] and Düngelhoef et al. [Animal Feed Science and
Technology 49, 1-10 (1994)] can be used. Regarding their thermotolerance any assay known in the state of
the art and described, e.g. by Yamada et al. (s.a.), and regarding their pH and substrate specificity profiles
any assays known in the state of the art and described, e.g. in Example 9 or by Yamada et al., s.a., can be
used.

In general the polypeptides of the present invention can be used without being limited to a specific field 
of application for the conversion of phytate to inositol and inorganic phosphate. 

Furthermore the polypeptides of the present invention can be used in a process for the preparation of
compound food or feeds wherein the components of such a composition are mixed with one or more
polypeptides of the present invention. Accordingly compound food or feeds comprising one or more
polypeptides of the present invention are also an object of the present invention. A man skilled in the art is
familiar with their process of preparation. Such compound foods or feeds can further comprise additives or
components generally used for such purpose and known in the state of the art.

It is furthermore an object of the present invention to provide a process for the reduction of levels of
phytate in animal manure characterized in that an animal is fed such a feed composition in an amount
effective in converting phytate contained in the feedstuff to inositol and inorganic phosphate.

Examples

Specific media and solutions used

Complete medium (Clutterbuck)

Glucose                                          10 g/l
-CN solution                                     10 ml/l
Sodium nitrate                                   6 g/l
Bacto peptone (Difco Lab., Detroit, MI, USA)     2 g/l
Yeast Extract (Difco)                            1 g/l
Casamino acids (Difco)                           1.5 g/l
Modified trace element solution                  1 ml/l
Vitamin solution                                 1 ml/l



M3 Medium

Glucose                                          10 g/l
-CN Solution                                     10 ml/l
Modified trace element solution                  1 ml/l
Ammonium nitrate                                 2 g/l



M3 Medium - Phosphate

M3 medium except that -CN is replaced with -CNP

M3 Medium - Phosphate + Phytate

M3 Medium - Phosphate with the addition of 5 g/l of Na12 Phytate (Sigma #P-3168; Sigma, St. Louis, MO,
USA)

Modified trace element solution

CuSO4                                            0.04%
FeSO4·7H2O                                       0.08%
Na2MoO4·2H2O                                     0.08%
ZnSO4·7H2O                                       0.8%
B4Na2O7·10H2O                                    0.004%
MnSO4·H2O                                        0.08%



Vitamin solution

Riboflavin                                       0.1%
Nicotinamide                                     0.1%
p-amino benzoic acid                             0.01%
Pyridoxine/HCl                                   0.05%
Aneurine/HCl                                     0.05%
Biotin                                           0.001%



-CN Solution

KH2PO4                                           140 g/l
K2HPO4·3H2O                                      90 g/l
KCl                                              10 g/l
MgSO4·7H2O                                       10 g/l



-CNP Solution

HEPES                                            47.6 g/200 ml
KCl                                              2 g/200 ml
MgSO4·7H2O                                       2 g/200 ml

Example 1 

Screening fungi for phytase activity 



Fungi were screened on a three plate system, using the following three media:

"M3" (a defined medium containing phosphate),
"M3-P" (M3 medium lacking phosphate) and
"M3-P + Phytate" (M3 medium lacking phosphate but containing phytate as a sole phosphorus
source).

Plates were made with agarose to decrease the background level of phosphate.

Fungi were grown on the medium and at the temperature recommended by the supplier. Either spores or
mycelium were transferred to the test plates and incubated at the recommended temperature until growth
was observed.

The following thermotolerant strains were found to exhibit such growth:

Myceliophthora thermophila [ATCC 48 102]
Talaromyces thermophilus [ATCC 20 186]
Aspergillus fumigatus [ATCC 34 625]

Example 2

Growth of fungi and preparation of genomic DNA

Myceliophthora thermophila, Talaromyces thermophilus, Aspergillus fumigatus, Aspergillus
nidulans, Aspergillus terreus 9A-1 and Aspergillus terreus CBS 220.95 were grown in Potato Dextrose Broth
(Difco Lab., Detroit, MI, USA) or complete medium (Clutterbuck). Aspergillus terreus 9A-1 and Aspergillus
[...] have been deposited under the Budapest Treaty for patent purposes at the DSM in Braunschweig,
BRD, on March 17, 1994 under accession number DSM 9076 and on February 17, 1995 under accession
number DSM 9743, respectively.

EP 0 684 313 A2

Genomic DNA was prepared as follows: 

Medium was inoculated at a high density with spores and grown O/N with shaking. This produced a thick
culture of small fungal pellets. The mycelium was recovered by filtration, blotted dry and weighed. Up to
2.0 g was used per preparation. The mycelium was ground to a fine powder in liquid nitrogen and
immediately added to 10 ml of extraction buffer (200 mM Tris/HCl, 250 mM NaCl, 25 mM EDTA, 0.5%
SDS, pH 8.5) and mixed well. Phenol (7 ml) was added to the slurry and mixed and then chloroform (3
ml) was also added and mixed well. The mixture was centrifuged (20,000 g) and the aqueous phase
recovered. RNase A was added to a final concentration of 250 µg/ml and incubated at 37 °C for 15 minutes.
The mixture was then extracted with 1 volume of chloroform and centrifuged (10,000 g, 10 minutes). The
aqueous phase was recovered and the DNA precipitated with 0.54 volumes of RT isopropanol for 1 hour at
RT. The DNA was recovered by spooling and resuspended in water.

The resultant DNA was further purified as follows:

A portion of the DNA was digested with proteinase K for 2 hrs at 37 °C and then extracted repeatedly
(twice to three times) with an equal volume of phenol/chloroform and then ethanol precipitated prior to
resuspension in water to a concentration of approximately 1 µg/µl.

Example 3 

Degenerate PCR

PCR was performed essentially according to the protocol of Perkin Elmer Cetus [(PEC); Norwalk, CT, USA].
The following two primers were used (bases indicated in brackets are either/or):

Phyt 8: 5' ATG GA(CT) ATG TG(CT) TCN TT(CT) GA 3' [SEQ ID NO:19] Degeneracy = 32
Tm High = 60 °C / Tm Low = 52 °C

Phyt 9: 5' TT(AG) CC(AG) GC(AG) CC(GA) TGN GG(GA) TA 3' [SEQ ID NO:20]
Tm High = 70 °C / Tm Low = 58 °C
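
The degeneracy and the Tm range quoted for the two primers can be checked with a short calculation. The sketch below is illustrative only: the function name primer_stats is hypothetical, and it assumes that Tm was estimated with the Wallace rule (2 °C per A or T, 4 °C per G or C), which the text does not state. Under that assumption the low and high Tm are obtained by choosing, at every degenerate position, the base that contributes least or most.

```python
import re

# One primer position: a bracketed either/or set such as (CT), or a single base.
TOKEN = re.compile(r"\(([ACGT]+)\)|([ACGTN])")

def primer_stats(primer: str):
    """Return (degeneracy, tm_low, tm_high) for a degenerate primer.

    Assumption (not stated in the source): Tm follows the Wallace rule,
    Tm = 2 * (A + T) + 4 * (G + C), in degrees Celsius.
    """
    degeneracy, tm_low, tm_high = 1, 0, 0
    for group, base in TOKEN.findall(primer.replace(" ", "")):
        # N stands for any of the four bases.
        choices = group or ("ACGT" if base == "N" else base)
        degeneracy *= len(choices)
        contributions = [4 if b in "GC" else 2 for b in choices]
        tm_low += min(contributions)
        tm_high += max(contributions)
    return degeneracy, tm_low, tm_high

PHYT8 = "ATG GA(CT) ATG TG(CT) TCN TT(CT) GA"
PHYT9 = "TT(AG) CC(AG) GC(AG) CC(GA) TGN GG(GA) TA"

print(primer_stats(PHYT8))  # (32, 52, 60)
print(primer_stats(PHYT9))  # (128, 58, 70)
```

For Phyt 8 this reproduces the stated degeneracy of 32 and the stated Tm values; for Phyt 9 it reproduces the stated Tm values, and the degeneracy of 128 is a computed figure that the text itself does not give.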
A typical reaction was performed as follows: 



H2O                              24.5 µl
10 x PEC GeneAmp Buffer           5 µl
GeneAmp dNTP's (10 mM)            8 µl
Primer 1 (Phyt 8, 100 µM)         5 µl
Primer 2 (Phyt 9, 100 µM)         5 µl
DNA (~1 µg/µl)                    1 µl
Taq Polymerase (PEC)              0.5 µl

Total                            50 µl
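
The final concentrations in the reaction follow from the dilution rule C1 x V1 = C2 x V2. The following minimal sketch uses the stock concentrations and volumes listed above and the stated total of 50 µl; the function name is hypothetical, and whether the 10 mM dNTP figure refers to the mix as a whole or to each nucleotide is not stated in the text.

```python
def final_concentration(stock: float, volume_ul: float, total_ul: float = 50.0) -> float:
    """Dilution rule C1 * V1 = C2 * V2, solved for C2 (same unit as the stock)."""
    if total_ul <= 0:
        raise ValueError("total volume must be positive")
    return stock * volume_ul / total_ul

# Each primer: 100 uM stock, 5 ul in a 50 ul reaction.
primer_uM = final_concentration(100.0, 5.0)
# dNTP stock: 10 mM, 8 ul in a 50 ul reaction.
dntp_mM = final_concentration(10.0, 8.0)

print(primer_uM, dntp_mM)
```

With these inputs each primer ends up at 10 µM and the dNTP stock at 1.6 mM.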



All components with the exception of the Taq polymerase were incubated at 95 °C for 10 minutes and then
50 °C for 10 minutes and then the reaction placed on ice. The Taq polymerase (Amplitaq, Hoffmann-La
Roche, Basel, CH) was then added and 35 cycles of PCR performed in a Triothermoblock (Biometra,
Göttingen, DE) according to the following cycle profile:

95 °C/60"

50 °C/90"

72 °C/120"

An aliquot of the reaction was analysed on a 1.5% agarose gel.

Example 4

Subcloning and sequencing of PCR fragments

PCR products of the expected size (approximately 146 bp predicted from the Aspergillus niger DNA
sequence) were excised from low melting point agarose and purified from a NACS-PREPAC column
(BRL Life Technologies Inc., Gaithersburg, MD, USA) essentially according to the manufacturer's protocol.
The fragment was polyadenylated in 50 µl 100 mM sodium cacodylate pH 6.6, 12.5 mM Tris/HCl pH 7.0, 0.1
mM dithiothreitol, 125 µg/ml bovine serum albumin, 1 mM CoCl2, 20 µM dATP, 10 units terminal
deoxytransferase (Boehringer Mannheim, Mannheim, DE) for 5 minutes at 37 °C and cloned into the p123T
vector [Mitchell et al., PCR Meth. App. 2, 81-82 (1992)].

Alternatively, PCR fragments were purified and cloned using the "Sure Clone" ligation kit (Pharmacia)
following the manufacturer's instructions.

Sequencing was performed on dsDNA purified on a Qiagen column (Diagen GmbH, Hilden, DE) using the
dideoxy method and the Pharmacia T7 kit (Pharmacia LKB Biotechnology AB, Uppsala, SE) according to
the protocol supplied by the manufacturer.

Example 5 



Construction and Screening of Lambda Fix II libraries 

The fragments from Aspergillus terreus Strain 9A-1 and Myceliophthora thermophila were used to probe
BamHI and BglII Southern blots to determine the suitable restriction enzyme to use to construct genomic
libraries in the Lambda Fix II vector (Stratagene, La Jolla, CA, USA). Lambda Fix II can only accept inserts
from 9-23 kb. Southern blots were performed according to the following protocol. Genomic DNA (10 µg) was
digested in a final volume of 200 µl. The reaction without enzyme was prepared and incubated on ice for 2
hours. The enzyme (50 units) was added and the reaction incubated at the appropriate temperature for 3
hours. The reaction was then extracted with an equal volume of phenol/chloroform and ethanol precipitated.
The resuspended DNA in loading buffer was heated to 65 °C for 15 minutes prior to separation on a 0.7%
agarose gel (O/N, 30 V). Prior to transfer the gel was washed twice in 0.2 M HCl for 10 minutes at room
temperature (RT) and then twice in 1 M NaCl/0.4 M NaOH for 15 minutes at RT. The DNA was transferred in
0.4 M NaOH in a capillary transfer for 4 hours to Nytran 13N nylon membrane (Schleicher and Schuell AG,
Feldbach, Zürich, CH). Following transfer the membrane was exposed to UV [Auto cross-link, UV
Stratalinker 2400, Stratagene (La Jolla, CA, USA)].

The membrane was prehybridized in hybridization buffer [50% formamide, 1% sodium dodecylsulfate
(SDS), 10% dextran sulfate, 4 x SSPE (1 x SSPE = 180 mM NaCl, 10 mM NaH2PO4, 1 mM EDTA, pH 7.4)] for
4 hours at 42 °C and, following addition of the denatured probe, hybridized O/N at 42 °C. The blot was washed:

1 x SSPE/0.5% SDS/RT/30 minutes

0.1 x SSPE/0.1% SDS/RT/30 minutes

0.1 x SSPE/0.1% SDS/65 °C/30 minutes
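
The salt concentrations in the hybridization buffer and in each wash can be derived by scaling the SSPE composition. The sketch below takes the 180 mM NaCl, 10 mM NaH2PO4 and 1 mM EDTA figures as the composition of 1 x SSPE, which is how Example 12 defines it; the function and variable names are hypothetical.

```python
# Composition of 1 x SSPE in mM, as defined in Example 12 of this document.
SSPE_1X_MM = {"NaCl": 180.0, "NaH2PO4": 10.0, "EDTA": 1.0}

def sspe(strength: float) -> dict:
    """Return the component concentrations (mM) of SSPE at the given strength."""
    if strength < 0:
        raise ValueError("strength must not be negative")
    return {name: round(mm * strength, 3) for name, mm in SSPE_1X_MM.items()}

for label, strength in [("hybridization, 4 x", 4.0),
                        ("first wash, 1 x", 1.0),
                        ("stringent washes, 0.1 x", 0.1)]:
    print(label, sspe(strength))
```

The drop from 720 mM NaCl during hybridization to 18 mM in the final washes, together with the rise to 65 °C, is what makes the last wash the most stringent step.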
Results indicate that Aspergillus terreus Strain 9A-1 genomic DNA digested with BamHI and
Myceliophthora thermophila genomic DNA digested with BglII produce fragments suitable for cloning into
the Lambda Fix II vector.

The construction of genomic libraries of Aspergillus terreus Strain 9A-1 and Myceliophthora thermophila
in Lambda Fix II was performed according to the manufacturer's protocols (Stratagene).
The lambda libraries were plated out on ten 137 mm plates for each library. The plaques were lifted to
Nytran 13N round filters and treated for 1 minute in 0.5 M NaOH/1.5 M NaCl followed by 5 minutes in 0.5 M
Tris-HCl pH 8.0/1.5 M NaCl. The filters were then treated in 2 x SSC for 5 minutes and air dried. They were
then fixed with UV (1 minute, UV Stratalinker 2400, Stratagene). The filters were hybridized and washed as
above. Putative positive plaques were cored and the phage soaked out in SM buffer (180 mM NaCl, 8 mM
MgSO4·7H2O, 20 mM Tris/HCl pH 7.5, 0.01% gelatin). This stock was diluted and plated out on 137 mm
plates. Duplicate filters were lifted and treated as above. A clear single positive plaque from each plate was
picked and diluted in SM buffer. Three positive plaques were picked: two from Aspergillus terreus Strain
9A-1 (9A1X17 and 9A1X22) and one from Myceliophthora thermophila (MTX27).

Example 6 



Preparation of Lambda DNA and confirmation of the clones 



Lambda DNA was prepared from the positive plaques. This was done using the "Magic Lambda Prep"
system (Promega Corp., Madison, WI, USA) according to the manufacturer's specifications. To
confirm the identity of the clones, the lambda DNA was digested with PstI and SalI and the resultant blot
probed with the PCR products. In all cases this confirmed the clones as containing sequences
complementary to the probe.

Example 7

Subcloning and sequencing of phytase genes 

DNA from 9A1X17 was digested with PstI and the resultant mixture of fragments ligated into pBluescript II
SK+ (Stratagene) cut with PstI and treated with shrimp alkaline phosphatase (United States Biochemical
Corp., Cleveland, OH, USA). The ligation was O/N at 16 °C. The ligation mixture was transformed into
XL-1 Blue Supercompetent cells (Stratagene) and plated on LB plates containing 0.5 mM
isopropyl-β-D-thiogalactopyranoside (IPTG), 40 µg/ml 5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside
(Xgal) and 50 µg/ml ampicillin.

DNA from 9A1X17 was digested with BglII and XbaI and the resultant mixture ligated into pBluescript II
SK+ digested with BamHI/XbaI. Ligation, transformation and screening were performed as described
above.

DNA from MTX27 was digested with SalI and the resultant mixture of fragments ligated into pBluescript II
SK+ cut with SalI and treated with shrimp alkaline phosphatase. The ligation was O/N at 16 °C. The
ligation mixture was transformed into XL-1 Blue Supercompetent cells and plated on LB plates containing
Xgal/IPTG and ampicillin.

Colonies from the above transformations were picked and "gridded" approximately 75 to a single plate.
Following O/N incubation at 37 °C the colonies were lifted to a nylon filter ("Hybond-N", Amersham Corp.,
Arlington Heights, IL, USA) and the filters treated with 0.5 M NaOH for 3 minutes, 1 M Tris/HCl pH 7.5 twice
for 1 minute, then 0.5 M Tris/HCl pH 7.5/1.5 M NaCl for 5 minutes. The filters were air dried and then fixed
with UV (2 minutes, UV Stratalinker 2400, Stratagene). The filters were hybridized with the PCR products of
Example 5. Positive colonies were selected and DNA prepared. The subclones were sequenced as
previously described in Example 4. Sequences determined are shown in Figure 1 (Fig. 1) for the phytase
from Aspergillus terreus strain 9A1 and its encoding DNA sequence, in Figure 2 for the phytase from
Myceliophthora thermophila and its encoding DNA sequence, in Figure 4 for part of the phytase
from Talaromyces thermophilus and its encoding DNA sequence, in Figure 5 for part of the phytase from
Aspergillus fumigatus and its encoding DNA sequence and in Figure 6 for part of the phytase from Aspergillus
nidulans and its encoding DNA sequence. Figure 3A shows a restriction map for the
DNA of Aspergillus terreus (wherein the arrow indicates the coding region, and the strips the regions
sequenced in addition to the coding region) and Figure 3B for M. thermophila. The sequences for the parts
of the phytases and their encoding DNA sequences from Talaromyces thermophilus, Aspergillus fumigatus
and Aspergillus nidulans were obtained in the same way as described for those of Aspergillus terreus
strain 9A1 and Myceliophthora thermophila in Examples 2-7. Bases are given for both strands in small
letters by the typically used one letter code abbreviations. Derived amino acid sequences of the phytase
are given in capital letters by the typically used one letter code below the corresponding DNA sequence.

Example 8 

Construction of a chimeric construct between A. niger and A. terreus phytase DNA sequences

All constructions were made using standard molecular biological procedures as described by Sambrook et
al. (1989) (Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Laboratory Press, NY).

The first 146 amino acids (aa) of the Aspergillus niger phytase, as described in EP 420 358, were fused to
the 320 C-terminal aa of the Aspergillus terreus 9A1 gene. An NcoI site was introduced at the ATG start
codon when the A. niger phytase gene was cloned by PCR. The intron found in the A. niger phytase was
removed by site directed mutagenesis (Bio-Rad kit, Cat. Nr. 170-3581; Bio-Rad, Richmond, CA, USA) using
the following primer (wherein the vertical dash indicates that the sequence to its left hybridizes to the 3' end of
the first exon and the sequence to its right hybridizes to the 5' end of the second exon):

5'-AGTCCGGAGGTGACT|CCAGCTAGGAGATAC-3' [SEQ ID NO:21].
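
The fusion described at the start of this Example (the first 146 aa of one phytase joined to the last 320 aa of another) can be expressed at the sequence level as a prefix plus a suffix. The sketch below is a hypothetical illustration only: the function name is invented and the input strings are placeholders, not the real phytase sequences, which are given in the Figures.

```python
def fuse(n_donor: str, c_donor: str, n_len: int = 146, c_len: int = 320) -> str:
    """Join the first n_len residues of n_donor to the last c_len residues of c_donor."""
    if len(n_donor) < n_len or len(c_donor) < c_len:
        raise ValueError("donor sequence shorter than the requested segment")
    return n_donor[:n_len] + c_donor[-c_len:]

# Placeholder sequences: 'N' stands in for residues of the N-terminal donor,
# 'T' for residues of the C-terminal donor. Their lengths are arbitrary.
n_terminal_donor = "N" * 400
c_terminal_donor = "T" * 450

chimera = fuse(n_terminal_donor, c_terminal_donor)
print(len(chimera))  # 466
```

With the segment lengths given in the text the chimera is 146 + 320 = 466 residues long, whatever the lengths of the two parent proteins.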

To construct the chimeric construct of phytases from A. niger and A. terreus an Eco47III site was
introduced into the A. niger coding sequence to aid cloning. PCR with a mutagenic primer (5' CGA TTC
GTA gCG CTG GTA G 3') in conjunction with the T3 primer was used to produce a DNA fragment that was
cleaved with BamHI and Eco47III. The BamHI/Eco47III fragment was inserted into BamHI/Eco47III cut
p9A1Pst (Example 7). Figure 7 shows the amino acid sequence of the fusion construct and its encoding
DNA sequence.

Example 9

Expression of phytases 

Construction of expression vectors

For expression of the fusion construct in A. niger an expression cassette was chosen where the fusion gene
was under control of the inducible A. niger glucoamylase (glaA) promoter.

For the complete A. terreus 9A1 gene, expression cassettes with the constitutive A. nidulans
glyceraldehyde-3-phosphate dehydrogenase (gpdA) promoter were made.

All genes used for expression in A. niger carried their own signal sequence for secretion.

Construction of vector pFPAN1

The A. niger glucoamylase (glaA) promoter was isolated as a 1960 bp XhoI/ClaI fragment from plasmid
pDH33 [Smith et al. (1990), Gene 88: 259-262] and cloned into pBluescriptSK^-vector (pBS) (Stratagene, La
Jolla, CA, USA) containing the 710 bp BamHI/XbaI fragment of the A. nidulans trpC terminator. The
plasmid with the cassette was named pGLAC. The fusion gene, as described in Example 8, was put under
control of the A. niger glaA promoter by ligating the blunt ended NcoI/EcoRI fragment to the blunt ended
ClaI site and the EcoRV site of plasmid pGLAC. The correct orientation was verified by restriction enzyme
digests. The entire cassette was transferred as a KpnI/XbaI fragment to pUC19 (New England Biolabs
GmbH, Schwalbach, BRD) that carried the Neurospora crassa pyr4 gene (pUC19-pyr4), a selection
marker in uridine auxotrophic Aspergilli, resulting in vector pFPAN1 (see Figure 8 with restriction sites and
coding regions as indicated; crossed out restriction sites indicate sites with blunt end ligation).

Construction of vector pPAT1

The A. nidulans glyceraldehyde-3-phosphate dehydrogenase (gpdA) promoter was isolated as a ~2.3 kb
EcoRI/NcoI fragment from plasmid pAN52-1 [Punt et al. (1987), Gene 56: 117-124], cloned into pUC19-NcoI
(pUC19 having a SmaI site replaced by an NcoI site), reisolated as an EcoRI/BamHI fragment and cloned into
pBS with the trpC terminator as described above. The obtained cassette was named pGPDN. The A.
terreus gene was isolated as an NcoI/EcoRI fragment, where the EcoRI site was filled in to create blunt ends.
Plasmid pGPDN was cut with BamHI and NcoI. The BamHI site was filled in to create blunt ends. The
NcoI/EcoRI(blunt) fragment of the A. terreus gene was cloned between the gpdA promoter and trpC
terminator. The expression cassette was isolated as a KpnI/XbaI fragment and cloned into pUC19-pyr4,
resulting in plasmid pPAT1 (see Figure 9; for explanation of abbreviations see legend to Figure 8).

Expression of the fusion protein in Aspergillus niger 

A) Transformation

The plasmid pFPAN1 was used to transform A. niger by using the transformation protocol as described by
Ballance et al. [(1983), Biochem. Biophys. Res. Commun. 112, 284-289] with some modifications:

- YPD medium (1% yeast extract, 2% peptone, 2% dextrose) was inoculated with 10^ spores per ml
  and grown for 24 hours at 30 °C and 250 rpm

- cells were harvested using Wero-Lene N tissue (No. 8011.0600, Wernli AG Verbandstoffabrik, 4852
  Rothrist, CH) and washed once with buffer (0.8 M KCl, 0.05 M CaCl2, in 0.01 M succinate buffer; pH
  5.5)

- for protoplast preparation only lysing enzymes (SIGMA L-2265, St. Louis, MO, USA) were used

- the cells were incubated for 90 min at 30 °C and 100 rpm, and the protoplasts were separated by
  filtration (Wero-Lene N tissue)

- the protoplasts were washed once with STC (1 M sorbitol, 0.05 M CaCl2, 0.01 M Tris/HCl pH 7.5) and
  resuspended in the same buffer

- 150 µl protoplasts (~10^ /ml) were gently mixed with 10-15 µg plasmid DNA and incubated at room
  temperature (RT) for 25 min

- polyethylene glycol (60% PEG 4000, 50 mM CaCl2, 10 mM Tris/HCl pH 7.5) was added in three steps,
  150 µl, 200 µl and 900 µl, and the sample was further incubated at room temperature (RT) for 25 min

- 5 ml STC were added, the sample was centrifuged and the protoplasts were resuspended in 2.5 ml YGS
  (0.5% yeast extract, 2% glucose, 1.2 M sorbitol)

- the sample was incubated for 2 hours at 30 °C (100 rpm) and centrifuged, and the protoplasts were
  resuspended in 1 ml 1.2 M sorbitol

- the transformed protoplasts were mixed with 20 ml minimal regeneration medium (0.7% yeast
  nitrogen base without amino acids, 2% glucose, 1 M sorbitol, 1.5% agar, 20 mM Tris/HCl pH 7.5,
  supplemented with 0.2 g arginine and 10 mg nicotinamide per liter)

- the plates were incubated at 30 °C for 3-5 days

B) Expression

Single transformants were isolated, purified and tested for overproduction of the fusion protein. 100 ml M25
medium (70 g maltodextrin (Glucidex 17D, Sugro, Basel, CH), 12.5 g yeast extract, 25 g casein-hydrolysate,
2 g KH2PO4, 2 g K2SO4, 0.5 g MgSO4·7H2O, 0.03 g ZnCl2, 0.02 g CaCl2, 0.05 g MnSO4·4H2O, 0.05 g FeSO4
per liter; pH 5.6) were inoculated with 10^ spores per ml from transformants FPAN1#11, #13, #16, #E25,
#E30 and #E31, respectively, and incubated for 5 days at 30 °C and 270 rpm. Supernatant was collected and
the activity determined. The fusion protein showed the highest activity with phytic acid as substrate at pH 2.5,
whereas with 4-nitrophenyl phosphate as substrate it showed two activity optima at pH 2.5 and 5.0 (Table
1).

C) Activity assay

a) Phytic acid

A 1 ml enzyme reaction contained 0.5 ml dialyzed supernatant (diluted if necessary) and 5.4 mM phytic
acid (SIGMA P-3168). The enzyme reactions were made in 0.2 M sodium acetate buffer pH 5.0 or
0.2 M glycine buffer pH 2.5, respectively. The samples were incubated for 15 min at 37 °C. The
reactions were stopped by adding 1 ml 15% TCA (trichloroacetic acid).

For the colour reaction 0.1 ml of the stopped sample was diluted with 0.9 ml distilled water and mixed
with 1 ml reagent solution (3 volumes 1 M H2SO4, 1 volume 2.5% (NH4)6Mo7O24, 1 volume 10%
ascorbic acid). The samples were incubated for 20 min at 50 °C and the blue colour was measured
spectrophotometrically at 820 nm. Since the assay is based on the release of phosphate, a phosphate
standard curve, 11 - 45 nmol per ml, was used to determine the activity of the samples.
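
A phosphate reading from the standard curve can be converted into units per ml by back-calculating through the dilution steps of the assay. The following is a sketch under stated assumptions, not the authors' calculation: it assumes that the standard-curve value refers to the 1 ml diluted sample (0.1 ml stopped reaction plus 0.9 ml water) before the reagent is added, that no blank is subtracted, and that 1 unit is 1 µmol phosphate released per min, as defined under Table 1. The function name and parameters are hypothetical.

```python
def phytase_units_per_ml(phosphate_nmol_per_ml: float,
                         predilution: float = 1.0,
                         supernatant_ml: float = 0.5,
                         minutes: float = 15.0) -> float:
    """Convert a standard-curve reading into units per ml of supernatant.

    Assumptions (not stated explicitly in the source): the reading is the
    phosphate concentration of the 1 ml diluted sample before reagent is
    added, and no blank correction is applied.
    """
    colour_dilution = 10.0    # 0.1 ml stopped sample made up to 1 ml with water
    stopped_volume_ml = 2.0   # 1 ml enzyme reaction + 1 ml 15% TCA
    # Total phosphate released in the enzyme reaction, in umol.
    released_umol = phosphate_nmol_per_ml * colour_dilution * stopped_volume_ml / 1000.0
    # umol per minute per ml of supernatant, corrected for any predilution.
    return released_umol / minutes / supernatant_ml * predilution

print(phytase_units_per_ml(30.0))
print(phytase_units_per_ml(30.0, predilution=100.0))
```

Under these assumptions a reading of 30 nmol per ml corresponds to about 0.08 units per ml of undiluted supernatant, which illustrates why supernatants with activities in the range of Table 1 would have to be diluted before assay.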

b) 4-nitrophenyl phosphate

A 1 ml enzyme reaction contained 100 µl dialyzed supernatant (diluted if necessary) and 1.7 mM
4-nitrophenyl phosphate (Merck, 6850, Darmstadt, BRD). The enzyme reactions were made in 0.2 M
sodium acetate buffer pH 5.0 or 0.2 M glycine buffer pH 2.5, respectively. The samples were incubated for
15 min at 37 °C. The reactions were stopped by adding 1 ml 15% TCA.
For the determination of the enzyme activity the protocol described above was used.

TABLE 1

                                      SUBSTRATE
Transformant       Phytic acid a)               4-Nitrophenyl phosphate a)
                   pH 5.0       pH 2.5          pH 5.0       pH 2.5
A. niger b)        0.2          1               1            2
FPAN1#11           6            49              173          399
FPAN1#13           2            21              60           228
FPAN1#16           1            16              46           153
FPAN1#E25          3            26              74           228
FPAN1#E30          3            43              157          347
FPAN1#E31          3            39              154          271

a) Units per ml: 1 unit = 1 µmol phosphate released per min at 37 °C
b) not transformed
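
The values of Table 1 can be compared programmatically, for example to pick the transformant with the highest phytic acid activity at pH 2.5 and express it relative to the untransformed strain. The sketch below simply transcribes the table; the variable names are hypothetical.

```python
# Table 1, units per ml: (phytic acid pH 5.0, phytic acid pH 2.5,
#                         4-nitrophenyl phosphate pH 5.0, 4-nitrophenyl phosphate pH 2.5)
TABLE_1 = {
    "A. niger (not transformed)": (0.2, 1, 1, 2),
    "FPAN1#11": (6, 49, 173, 399),
    "FPAN1#13": (2, 21, 60, 228),
    "FPAN1#16": (1, 16, 46, 153),
    "FPAN1#E25": (3, 26, 74, 228),
    "FPAN1#E30": (3, 43, 157, 347),
    "FPAN1#E31": (3, 39, 154, 271),
}
PHYTIC_PH25 = 1  # index of the phytic acid, pH 2.5 column

transformants = [name for name in TABLE_1 if name.startswith("FPAN1")]
best = max(transformants, key=lambda name: TABLE_1[name][PHYTIC_PH25])
control = TABLE_1["A. niger (not transformed)"][PHYTIC_PH25]
fold_over_control = TABLE_1[best][PHYTIC_PH25] / control

print(best, fold_over_control)
```

On these figures FPAN1#11 is the most active transformant, with 49 times the phytic acid activity of the untransformed strain at pH 2.5, which is consistent with its selection for the fermentation in Example 10.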



Expression of the Aspergillus terreus 9A1 gene in Aspergillus niger

A. niger NW205 was transformed with plasmid pPAT1 as described above. Single transformants were
isolated, purified and screened for overproduction of the A. terreus protein. 50 ml YPD medium were
inoculated with 10^ spores per ml from transformants PAT1#3, #10, #11, #13 and #16 and incubated for 3
days at 30 °C and 270 rpm. Supernatant was collected and the activity determined as described above
except that the pH values for the enzyme reactions were different. The enzyme showed its main activity at
pH 5.5 with phytic acid as substrate and at pH 3.5 with 4-nitrophenyl phosphate as substrate (Table 2).

TABLE 2

                                      SUBSTRATE
Transformant       Phytic acid a)               4-Nitrophenyl phosphate a)
                   pH 5.5       pH 3.5          pH 5.5       pH 3.5
A. niger b)        0            0               0            0.1
PAT1#3             10           0               0.2          0.7
PAT1#10            9            0               0.2          0.8
PAT1#11            5            0               0.1          0.5
PAT1#13            9            0               0.2          0.7
PAT1#16            5            0               0.1          0.5

a) Units per ml: 1 unit = 1 µmol phosphate released per min at 37 °C
b) not transformed

Example 10

Fermentation of Aspergillus niger NW 205 transformants 
A) Transformant FPAN1#11

Preculture medium [30 g maltodextrin (Glucidex 17D), 5 g yeast extract, 10 g casein-hydrolysate, 1 g
KH2PO4, 0.5 g MgSO4·7H2O, 3 g Tween 80 per liter; pH 5.5] was inoculated with 10^ spores per ml in a
shake flask and incubated for 24 hours at 34 °C and 250 rpm.

A 10 liter fermenter was inoculated with the pre-culture to a final dilution of the pre-culture of 1:100. The
batch fermentation was run at 30 °C with an automatically controlled dissolved oxygen concentration of
minimum 25% (pO2 ≥ 25%). The pH was kept at 3.0 by automatic titration with 5 M NaOH.
The medium used for the fermentation was: 35 g maltodextrin, 9.4 g yeast extract, 18.7 g
casein-hydrolysate, 2 g KH2PO4, 0.5 g MgSO4·7H2O, 2 g K2SO4, 0.03 g ZnCl2, 0.02 g CaCl2, 0.05 g
MnSO4·4H2O, 0.05 g FeSO4 per liter; pH 5.6.

Enzyme activities reached after 3 days under these conditions were 35 units/ml at pH 2.5 and 16 units/ml
at pH 5.0 with phytic acid as substrate, and 295 units/ml at pH 2.5 and 90 units/ml at pH 5.0 with
4-nitrophenyl phosphate as substrate.

B) Transformant PAT1#11

Preculture, inoculation of the fermenter and the fermentation medium were as described above, except that
the pH was kept at 4.5 by automatic titration with 5 M NaOH.

Enzyme activities reached after 4 days under these conditions were 17.5 units/ml at pH 5.5 with phytic acid
as substrate and 2 units/ml at pH 3.5 with 4-nitrophenyl phosphate as substrate.

Example 11

Isolation of PCR fragments of a phytase gene of Aspergillus terreus (CBS 220.95)

Two different primer pairs were used for PCR amplification of fragments using DNA of Aspergillus terreus
[CBS 220.95]. The primers used are shown in the Table below.

Fragment amplified        Primer   Oligonucleotide sequence (5' to 3')                          Amino acids
8 plus 9, about 150 bp    8        ATGGA(C/T)ATGTG(C/T)TC(N)TT(C/T)GA [SEQ ID NO:8]             254-259: MDMCSF
                          9        TT(A/G)CC(A/G)GC(A/G)CC(G/A)TG(N)CC(A/G)TA [SEQ ID NO:9]     296-301: YGHGAG
10 plus 11, about 250 bp  10       TA(C/T)GC(N)GA(C/T)TT(C/T)TC(N)CA(C/T)GA [SEQ ID NO:10]      349-354: YADFSH
                          11       CG(G/A)TC(G/A)TT(N)AC(N)AG(N)AC(N)C [SEQ ID NO:11]           416-422: RVLVNDR

DNA sequences in bold show the sense primer and in italics the antisense primer. The primers correspond
to the indicated part of the coding sequence of the Aspergillus niger gene. The combinations used are
primers 8 plus 9 and 10 plus 11. The Taq-Start antibody kit from Clontech (Palo Alto, CA, USA) was used
according to the manufacturer's protocol. Primer concentrations for 8 plus 9 were 0.2 mM and for primers
10 plus 11 one mM. Touch-down PCR was used for amplification [Don, R.H. et al. (1991), Nucleic Acids
Res. 19, 4008]. First the DNA was denatured for 3 min at 95°C. Then two cycles were done at each of the
following annealing temperatures: 60°C, 59°C, 58°C, 57°C, 56°C, 55°C, 54°C, 53°C, 52°C and 51°C, with an
annealing time of one min each. Prior to annealing the incubation was heated to 95°C for one min and after
annealing elongation was performed for 30 sec at 72°C. Cycles 21 to 35 were performed as follows:
denaturation one min at 95°C, annealing one min at 50°C and elongation for 30 sec at 72°C.
Two different PCR fragments were obtained. The DNA sequences obtained and their comparison to relevant
parts of the phytase gene of Aspergillus terreus 9A1 are shown in Figure 10 [relevant parts of the phytase
gene of Aspergillus terreus 9A1 "9A1" (top lines) (1) and the PCR fragments of Aspergillus terreus CBS
220.95 "aterr21" (bottom lines). Panel A: Fragment obtained with primer pair 8 plus 9 (aterr21). Panel B:
Fragment obtained with primer pair 10 plus 11 (aterrSS). DNA sequences of Aspergillus terreus CBS
220.95 (top lines) are compared with those of Aspergillus terreus 9A1 (1) (bottom lines). Panel A: The bold
gc sequence (bases 16 plus 17) in the aterr21 fragment could possibly be cg (DNA sequencing
uncertainty). Panel B: The x at position 26 of the aterrSS PCR fragment could possibly represent any of the
four nucleotides].
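
The touch-down PCR profile used in this Example (after the initial 3 min denaturation at 95°C) can be written out cycle by cycle. The sketch below is illustrative: the Cycle type and the function name are hypothetical, times are in seconds and temperatures in °C.

```python
from typing import List, NamedTuple

class Cycle(NamedTuple):
    denature_c: int
    denature_s: int
    anneal_c: int
    anneal_s: int
    extend_c: int
    extend_s: int

INITIAL_DENATURATION = (95, 180)  # 3 min at 95 degC before cycling starts

def touchdown_profile() -> List[Cycle]:
    """Build the 35-cycle touch-down profile described in the text."""
    cycles: List[Cycle] = []
    # Cycles 1 to 20: two cycles at each annealing temperature from 60 down to 51 degC.
    for anneal_c in range(60, 50, -1):
        cycles.extend([Cycle(95, 60, anneal_c, 60, 72, 30)] * 2)
    # Cycles 21 to 35: annealing fixed at 50 degC.
    cycles.extend([Cycle(95, 60, 50, 60, 72, 30)] * 15)
    return cycles

profile = touchdown_profile()
print(len(profile), profile[0].anneal_c, profile[19].anneal_c, profile[20].anneal_c)
# 35 60 51 50
```

Starting above the expected annealing temperature and stepping down favours amplification from the best-matching template early on, which is the usual rationale for touch-down PCR with degenerate primers.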

Example 12 

Cross hybridizations under non-stringent and stringent washing conditions 

Five µg of genomic DNA of each strain listed in Table 3 were incubated with 4 units of HindIII or PstI,
respectively, per µg of DNA at 37°C for 4 hours. After digestion, the mixtures were extracted with phenol
and DNAs were precipitated with ethanol. Samples were then analyzed on 0.8% agarose gels. DNAs were
transferred to Nytran membranes (Schleicher & Schuell, Keene, NH, USA) using 0.4 M NaOH containing 1 M
NaCl as transfer solution. Hybridizations were performed for 18 hours at 42°C. The hybridization solution
contained 50% formamide, 1% SDS, 10% dextran sulphate, 4 x SSPE (1 x SSPE = 0.18 M NaCl, 1 mM
EDTA, 10 mM NaH2PO4, pH 7.4), 0.5% blotto (dried milk powder in H2O) and 0.5 mg salmon sperm DNA
per ml. The membranes were washed under non-stringent conditions, using as last and most stringent
washing condition incubation for 30 min at room temperature in 0.1 x SSPE containing 0.1% SDS. The
probes (labelled at a specific activity of around 10^ dpm/µg DNA) used were the PCR fragments generated
with primers 8 plus 9 (see Example 11) using genomic DNA of Myceliophthora thermophila (Mycelio.
thermo.), Aspergillus nidulans (Asperg. nidul.), Aspergillus fumigatus (Asperg. fumig.), Aspergillus
terreus 9A1 (Asperg. terreus 9A1) and Talaromyces thermophilus (Talarom. thermo.). The MT2 genomic
probe was obtained by random priming (according to the protocol given by Pharmacia, Uppsala, Sweden)
and spans 1410 bp, from the BspEI site upstream of the N-terminus of the Mycelio. thermo. phytase gene
to the PvuII site in the C-terminus (positions 2068 to 3478). The AT2 genomic probe was obtained by
random priming and spans 1365 bp, from the ApaI site to the NdeI site of the Asperg. terreus 9A1
phytase gene (positions 491 to 1856). The AN2 DNA probe was obtained by random priming and spans the
complete coding sequence (1404 bp) of the Asperg. niger gene (EP 420 358). Results are given in Table 3
[* except for a weak signal corresponding to a non-specific 20 kb fragment; in case of the very weak
cross-hybridization signal at 20 kb seen with DNA from Aspergillus niger using the PCR fragment from
Talaromyces thermophilus this signal is unspecific, since it differs significantly from the expected 10 kb
HindIII fragment containing the phytase gene; ** signal due to only partial digest of DNA].
For cross-hybridizations with stringent washing conditions membranes were further washed for 30 min at
65 °C in 0.1 x SSPE containing 0.1% SDS. Results are shown in Table 4 [only the 10.5-kb HindIII
fragment is still detected; the 6.5-kb HindIII fragment disappeared (see Table 3)].

Table 3

Each entry is the band (kb) detected with the probe of the column; [?] marks a digit that is illegible in the source.

PCR probes
  1: probe of Asperg. fumig.
  2: probe of Asperg. nidul.
  3: probe of Asperg. terreus
  4: probe of Mycelio. thermo.
  5: probe of Talarom. thermo.
Genomic probes
  6: genomic probe MT2 of Mycelio. thermo.
  7: genomic probe AT2 of Asperg. terreus 9A1
DNA probes
  8: cDNA probe AN2 of Asperg. niger (control)

Source of DNA used for cross-hybridization   1    2    3                 4             5     6                  7                  8
Acrophialophora levis [ATCC 48380]           no   no   no                no            no    8-kb               no                 no
Aspergillus niger [ATCC 9142] (control)      no   no   no                no            no*   no                 no                 10-kb HindIII
Aspergillus terreus [CBS 220.95]             no   no   11-kb             no            no    no                 11-kb HindIII      no
Aspergillus sojae [CBS 221.95]               no   no   no                no            no*   no                 17-kb HindIII      no
Calcarisporiella thermophila [ATCC 22718]    no   no   10.5-kb HindIII   no            no    10.5-kb HindIII    10.5-kb HindIII    no
Chaetomium rectopilium [ATCC 22431]          no   no   no                no            no    >20-kb** HindIII   >20-kb** HindIII   no
Corynascus thermophilus [ATCC 22066]         no   no   no                no            no    10[?]-kb HindIII   no                 no
Humicola sp. [ATCC 60849]                    no   no   no                no            no    5.[?]-kb HindIII   no                 no
Mycelia sterilia [ATCC 20350]                no   no   no                [?] HindIII   no    [?]-kb HindIII     6-kb HindIII       no
Table 3 (continued)

[The cell layout of this part of the table is not recoverable from the source. Strains listed: Myriococcum
thermophilum [ATCC 22112], Rhizomucor miehei [ATCC 22064], Sporotrichum cellulophilum [ATCC 20494],
Sporotrichum thermophile [ATCC 22482], Scytalidium indonesicum [ATCC 46858], Aspergillus fumigatus
[ATCC 34625], Aspergillus nidulans [DSM 9743], Aspergillus terreus 9A1 [DSM 9076], Myceliophthora
thermophila [ATCC 48102] and Talaromyces thermophilus [ATCC 20186]. Legible band entries, which cannot
be assigned to a strain or probe: 13-kb HindIII; 3.8-kb HindIII; 2.1/3.7-kb PstI; 9-kb; 6-kb and 10.5-kb
HindIII.]
[Table 4 (cross-hybridizations under stringent washing conditions): contents not recoverable from the source.]

SEQUENCE LISTING 



(1) GENERAL INFORMATION:

(i) APPLICANT:

(A) NAME: F. HOFFMANN-LA ROCHE AG

(B) STREET: Grenzacherstrasse 124

(C) CITY: Basle

(D) STATE: BS

(E) COUNTRY: Switzerland

(F) POSTAL CODE (ZIP): CH-4002

(G) TELEPHONE: 061 - 688 25 05

(H) TELEFAX: 061 - 688 13 95

(I) TELEX: 962292/965542 hlr ch

(ii) TITLE OF INVENTION: Polypeptides with phytase activity
(iii) NUMBER OF SEQUENCES: 21

(iv) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk

(B) COMPUTER: Apple Macintosh

(C) OPERATING SYSTEM: System 7.1 (Macintosh)

(D) SOFTWARE: Word 5.0

(vi) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: EP 94810228.0

(B) FILING DATE: 25-APR-1994

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2327 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: join(374..420, 469..1819)
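
The CDS feature location tells the reader which bases to splice together before translating. As an illustration only, the sketch below applies a join location to bases 361 to 508 of SEQ ID NO: 1 as printed in this listing and translates the result with the standard genetic code; because the excerpt ends at base 508, the second exon is truncated there rather than at 1819. The function names are hypothetical.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the conventional TCAG ordering.
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

# Bases 361..508 of SEQ ID NO: 1, grouped as in the printed listing.
EXCERPT_START = 361
EXCERPT = "".join("""
CGTGATGGCT ACC ATG GGC TTT CTT GCC ATT GTG CTC TCC GTC GCC TTG
CTC TTT AGA AG GTATGCACCC CTCTACGTCC AATTCTCTGG GCACTGACAA
CGGCGCAG C ACA TCG GGC ACC CCG TTG GGC CCC CGG GGC AAA CAT AGC
""".split())

def splice(seq: str, seq_start: int, exons) -> str:
    """Join the exons, given as inclusive 1-based (start, end) positions."""
    return "".join(seq[start - seq_start:end - seq_start + 1] for start, end in exons)

def translate(cds: str) -> str:
    """Translate complete codons only; a trailing partial codon is ignored."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))

# First exon as annotated; second exon cut at the end of the excerpt.
cds = splice(EXCERPT, EXCERPT_START, [(374, 420), (469, 508)])
print(translate(cds))  # MGFLAIVLSVALLFRSTSGTPLGPRGKHS
```

The serine codon at the exon junction is assembled from AG at the end of the first exon and C at the start of the second, which is why the intron has to be removed before translation.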

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

TCTAGAACAA TAACAGGTAC TCCCTAGGTA CCCGAAGGAC CTTGTGGAAA ATGTATGGAG 60

GTGGACACGG CACCAACCAC CACCCGCGAT GGCGCACGTG GTGCCCTAAC CCCTTGCTCC 120

CTCAGGATGG AATCCATGTC GACTCTTTAC CCTCACCATC GCCTGGATGA AACCTCCCCG 180

CTAAGCTCAC GACGATCGCT ATTTCCGACC GATTTGACCG TCATGGTGGA GGGCTGATTC 240

GGTCGATGCT CCTGCCTTCA TTTCGGAGTT CGGAGACATG AAAGGCTTAT ATGAGGACGT 300

CCCAGGTCGG GGACGAAATC CGCCCTGGGC TGTGCTCCTT CGTCGGAAAC ATCTGCTGTC 360

CGTGATGGCT ACC ATG GGC TTT CTT GCC ATT GTG CTC TCC GTC GCC TTG 409
               Met Gly Phe Leu Ala Ile Val Leu Ser Val Ala Leu
               1               5                   10

CTC TTT AGA AG GTATGCACCC CTCTACGTCC AATTCTCTGG GCACTGACAA 460
Leu Phe Arg Ser
        15

CGGCGCAG C ACA TCG GGC ACC CCG TTG GGC CCC CGG GGC AAA CAT AGC 508
           Thr Ser Gly Thr Pro Leu Gly Pro Arg Gly Lys His Ser
                       20                  25

GAC TGC AAC TCA GTC GAT CAC GGC TAT CAA TGC TTT CCT GAA CTC TCT 556
Asp Cys Asn Ser Val Asp His Gly Tyr Gln Cys Phe Pro Glu Leu Ser
30                  35                  40                  45

CAT AAA TGG GGA CTC TAC GCG CCC TAC TTC TCC CTC CAG GAC GAG TCT 604
His Lys Trp Gly Leu Tyr Ala Pro Tyr Phe Ser Leu Gln Asp Glu Ser
                50                  55                  60

CCG TTT CCT CTG GAC GTC CCA GAG GAC TGT CAC ATC ACC TTC GTG CAG 652
Pro Phe Pro Leu Asp Val Pro Glu Asp Cys His Ile Thr Phe Val Gln
            65                  70                  75

GTG CTG GCC CGC CAC GGC GCG CGG AGC CCA ACC CAT AGC AA? ACC AAG 700
Val Leu Ala Arg His Gly Ala Arg Ser Pro Thr His Ser Lys Thr Lys
        80                  85                  90

GCG TAC GCG GCG ACC ATT GCG GCC ATC CAG AAG AGT GCC ACT GCG TTT 748
Ala Tyr Ala Ala Thr Ile Ala Ala Ile Gln Lys Ser Ala Thr Ala Phe
    95                  100                 105

CCG GGC AAA TAC GCG TTC CTG CAG TCA TAT AAC TAC TCC TTG GAC TCT 796
Pro Gly Lys Tyr Ala Phe Leu Gln Ser Tyr Asn Tyr Ser Leu Asp Ser
110                 115                 120                 125

GAG GAG CTG ACT CCC TTC GGG CGG AAC CAG CTG CGA GAT CTG GGC GCC 844
Glu Glu Leu Thr Pro Phe Gly Arg Asn Gln Leu Arg Asp Leu Gly Ala
                130                 135                 140

CAG TTC TAC GAG CGC TAC AAC GCC CTC ACC CGA CAC ATC AAC CCC TTC 892
Gln Phe Tyr Glu Arg Tyr Asn Ala Leu Thr Arg His Ile Asn Pro Phe
            145                 150                 155

GTC CGC GCC ACC GAT GCA TCC CGC GTC CAC GAA TCC GCC GAG AAG TTC 940
Val Arg Ala Thr Asp Ala Ser Arg Val His Glu Ser Ala Glu Lys Phe
        160                 165                 170

GTC GAG GGC TTC CAA ACC GCT CGA CAG GAC GAT CAT CAC GCC AAT CCC 988
Val Glu Gly Phe Gln Thr Ala Arg Gln Asp Asp His His Ala Asn Pro
    175                 180                 185

CAC CAG CCT TCG CCT CGC GTG GAC GTG GCC ATC CCC GAA GGC AGC GCC 1036
His Gln Pro Ser Pro Arg Val Asp Val Ala Ile Pro Glu Gly Ser Ala
190                 195                 200                 205

TAC AAC AAC ACG CTG GAG CAC AGC CTC TGC ACC GCC TTC GAA TCC AGC 1084
Tyr Asn Asn Thr Leu Glu His Ser Leu Cys Thr Ala Phe Glu Ser Ser
                210                 215                 220
335 " 340 345 



IT. rlr ITr Val cT« OXn ..r .s, O.y Tyr .la .Xa 

385 39° 

GCC TGG ACG GTG CCG TTC GCC GCT CGC GCG TAG GTC GAG ATG ATG CAG 
l% rTr vll Pro Phe Ala Ala Arg Ala Tyr Val Glu Met Met Gin 
400 



CGG GAC GCT TTC GTC GCG GGG CTG AGC TTT GCG CAG 5^ 5?^ 

Arg ASP Ala Phe Val Ala Gly Leu Ser Phe Ala Gin Ala Gly Gly Asn 
450 



1180 



1228 



1276 



ACC GTC GGC GAC GAC GCG GTC GCC AAC TTC ACC GCC GTG TTC GCG CCG 1132 
Thr val Gly Aap Aap Ala Val Ala Asn Phe Thr Ala Val Phe Ala F 

225 230 ^-^^ 

GCG ATC GCC CAG CGC CTG GAG GCC GAT CTT CCC GGC GTG CAG CTG TCC 
Sa lie Ala Gin Arg I^u Glu Ala Aap Leu Pro Gly Val Gin Leu Ser 
240 245 

ACC GAC GAC GTG GTC AAC CTG ATG GCC ATG TGT CCG TTC GAG ACG GTC 
Thr A3P A5P val val Asn Leu Met Ala Met Cys Pro Phe Glu Thr Val 
'° 255 260 265 

AGC CTG ACC GAC GAC GCG CAC ACG CTG TCG CCG TTC TGC GAC CTC TTC 
ser Leu Thr Asp Asp Ala His Thr Leu Ser Pro Phe Cys Asp Leu Phe 
270 275 

ACG GCC ACT GAG TGG ACG CAG TAC AAC TAC CTG CTC TCG CTG GAC AAG 1324 
Sa ihr Thr Gin Tyr Asn Tyr Leu Leu Ser Leu Asp Lys 

290 

TAC TAC GGC TAC GGC GGG GGC AAT CCG CTG GGT CCG GTG CAG GGG GTC 
20 Tyr Tyr Gly Tyr Gly Gly Gly Asn Pro Leu Gxy Pro Val Gin Gly 

C3GC TGG GCG AAC GAG CTG ATG GCG CGG CTA ACG CGC GCC CCG GTG CAC 
Gly Trp Ala Asn Glu Leu Met Ala Arg Leu Thr Arg Ala Pro Val His 
320 325 

GAC CAC ACC TGC GTC AAC AAC ACC CTC GAC GCG AGT CCG GCC ACC TTC 
Asp His Thr Cys Val Asn Asn Thr Leu Asp Ala Ser Pro Ala Thr Pne 



1372 



1420 



1516 



1564 



rrr ctC AAC GCC ACC CTC TAC GCC GAC TTC TCC CAC GAC AGC AAC CTG 
P^o Su Sa ?5r 2u Tyr Ala Asp Phe Ser His Asp Ser Asn Leu 

350 355 360 

GTG TCG ATC TTC TGG GCG CTG GGC CTG TAC AAC GGC ACC GCG CCG CTG 
val Ser He Phe Trp Ala Leu Gly Leu Tyr Asn Gly Thr Ala Pro Leu 
370 375 380 

" TCG CAG ACC TCC GTC GAG AGC GTC TCC CAG ACG GAC GGG TAC GCC GCC 



1660 



1708 



TGT CGC GCC GAG AAG GAG CCG CTG GTG CGC GTG CTG GTC AAC GAC CGG 
IZ ^9 Oil Glu Pro Leu Val Arg Val Leu Val Asn Asp Arg 

43^5 420 

GTC ATG CCG CTG CAT GGC TGC CCT ACG GAC AAG CTG GGG CGG TGC AAG 1^5 6 

vll nil Pro 2u His Gly Cys Pro Thr Asp Lys Leu Gly Arg Cys Lys 
430 435 440 



1804 
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1859 



10 



15 



25 



TGG GCG GAT TGT TTC TGATGTTGAG AAGAAAGGTA GATAGATAGG TAGTACATAT 
Trp Ala Asp Cys Phe 
465 

GGATTGCTCG GCTCTGGGTC GTTGCCCACA ATGCATATTA CGCCCGTCAA CTGCCTTGCG 1919 

CCATCCACCT CTCACCCTGG ACGCAACCGA GCGGTCTACC CTGCACACGG CTTCCACCGC 197 9 
GACGCGCACG GATAAGGCGC TTTTGTTACG GGGTTGGGGC TGGGGGCAGC CGGAGCCGGA . 203 9 

GAGAGAGACC AGCGTGAAAA ACGACAGAAC ATAGATATCA ATTCGACGCC AATTCATGCA 2099 

GAGTAGTATA CAGACGAACT GAAACAAACA CATCACTTCC CTCGCTCCTC TCCTGTAGAA 2159 

GACGCTCCCA CCAGCCGCTT CTGGCCCTTA TTCCCGTACG CTAGGTAGAC CAGTCAGCCA 2219 

TCACAAGAAC GGGGGCGGGG GACACACTCC GCTCGTACAG CACCCACGAC 227 9 



GACGCATGCC 



GTGTACAGGA AAACCGGCAG CGCCACAATC GTCGAGAGCC ATCTGCAG 

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 466 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met Gly Phe Leu Ala Ile Val Leu Ser Val Ala Leu Leu Phe Arg Ser
1 5 10 15

Thr Ser Gly Thr Pro Leu Gly Pro Arg Gly Lys His Ser Asp Cys Asn
20 25 30

Ser Val Asp His Gly Tyr Gln Cys Phe Pro Glu Leu Ser His Lys Trp
35 40 45

Gly Leu Tyr Ala Pro Tyr Phe Ser Leu Gln Asp Glu Ser Pro Phe Pro
50 55 60

Leu Asp Val Pro Glu Asp Cys His Ile Thr Phe Val Gln Val Leu Ala
65 70 75 80

Arg His Gly Ala Arg Ser Pro Thr His Ser Lys Thr Lys Ala Tyr Ala
85 90 95

Ala Thr Ile Ala Ala Ile Gln Lys Ser Ala Thr Ala Phe Pro Gly Lys
100 105 110

Tyr Ala Phe Leu Gln Ser Tyr Asn Tyr Ser Leu Asp Ser Glu Glu Leu
115 120 125

Thr Pro Phe Gly Arg Asn Gln Leu Arg Asp Leu Gly Ala Gln Phe Tyr
130 135 140

Glu Arg Tyr Asn Ala Leu Thr Arg His Ile Asn Pro Phe Val Arg Ala
145 150 155 160

Thr Asp Ala Ser Arg Val His Glu Ser Ala Glu Lys Phe Val Glu Gly
165 170 175

Phe Gln Thr Ala Arg Gln Asp Asp His His Ala Asn Pro His Gln Pro
180 185 190

Ser Pro Arg Val Asp Val Ala Ile Pro Glu Gly Ser Ala Tyr Asn Asn
195 200 205

Thr Leu Glu His Ser Leu Cys Thr Ala Phe Glu Ser Ser Thr Val Gly
210 215 220

Asp Asp Ala Val Ala Asn Phe Thr Ala Val Phe Ala Pro Ala Ile Ala
225 230 235 240

Gln Arg Leu Glu Ala Asp Leu Pro Gly Val Gln Leu Ser Thr Asp Asp
245 250 255

Val Val Asn Leu Met Ala Met Cys Pro Phe Glu Thr Val Ser Leu Thr
260 265 270

Asp Asp Ala His Thr Leu Ser Pro Phe Cys Asp Leu Phe Thr Ala Thr
275 280 285

Glu Trp Thr Gln Tyr Asn Tyr Leu Leu Ser Leu Asp Lys Tyr Tyr Gly
290 295 300

Tyr Gly Gly Gly Asn Pro Leu Gly Pro Val Gln Gly Val Gly Trp Ala
305 310 315 320

Asn Glu Leu Met Ala Arg Leu Thr Arg Ala Pro Val His Asp His Thr
325 330 335

Cys Val Asn Asn Thr Leu Asp Ala Ser Pro Ala Thr Phe Pro Leu Asn
340 345 350

Ala Thr Leu Tyr Ala Asp Phe Ser His Asp Ser Asn Leu Val Ser Ile
355 360 365

Phe Trp Ala Leu Gly Leu Tyr Asn Gly Thr Ala Pro Leu Ser Gln Thr
370 375 380

Ser Val Glu Ser Val Ser Gln Thr Asp Gly Tyr Ala Ala Ala Trp Thr
385 390 395 400

Val Pro Phe Ala Ala Arg Ala Tyr Val Glu Met Met Gln Cys Arg Ala
405 410 415

Glu Lys Glu Pro Leu Val Arg Val Leu Val Asn Asp Arg Val Met Pro
420 425 430

Leu His Gly Cys Pro Thr Asp Lys Leu Gly Arg Cys Lys Arg Asp Ala
435 440 445

Phe Val Ala Gly Leu Ser Phe Ala Gln Ala Gly Gly Asn Trp Ala Asp
450 455 460

Cys Phe
465

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3995 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: join(2208..2263, 2321..3725)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:



GTCGACGAGG CACACCACGC CCGTCCTCGG CGGGTCCGAG AGGGCCGGGC TCGGGTTCGA 60

CAAGGAGACG GGCGTCCCTT CGGGCGCGGC TGCGGGTGTG (SGTGTTGCTG TGGACGGTGA 120

GGAGGGGGAC GGGCTGGGCG TTGATGACGG TACGAATGCG AACGGACACii -GGCCGCTGAG 180

CGTGGGTGTT GCGTTCTAAT CTTTCTTTGT GTGGGTGTGT ACGTGTGGGT GTGTATGTGT 240

TTGGGGGGGG GAATGTTCTT GGTAATTATC TTTCTACCCT TCTTCTCTTT CCTTTATTCT 300

GTTCAGCAGG TATACCCCGT GTAAGTGTAC AGGATTATGG GACGGGTGGG TGGATGGACT 360

ACTTCTAGAA GGACGGATAA GGAAAAAGGG GAAACACGAA TATGGCGCCC TGGGTGGCGC 420

GTCGAGCTGG ATGCTTGACG CCGGTCTGGC AAACATTTTC TTCTTCTAGC ACCCAACCTA 480

GTACTTGATA GAGTGTTTCG GGGCCAGGCG GTTTGCGCTG TGTTTTTACC AATCACCAAC 540

TAGTGCTACT ACTATTATTG CGGCTGTTGA TGCAGCCGTG TACCAAAAAT GCCGCGGCAT 600

CTCCATTGAT ACTTGTAGTT TTGATAGATC AATATTTGGG AGGTTGCGCT GGGCTGCTCT 660

GAAACCCCTC TCTCTTGCTG TACGTAACGT ATGTGCACAG TATGTCACCG ACAAAGACGA 720

TTGCATGCGC ATCGTTTTTT GTTGTGTTTC AGGCCTCGCT CGTGTCTAGG GTATAAACAC 780

ATTGAAGACT ACATATGCGC AAGACGTTGA CATTAACGGG GTCCTGCAGC CGCCGCAGGT 840

GCATGTCGTG ATTAATACCA CGCGCCTGCG TAAATTAGCT AGCCGCCGCC CTGTTTCACT 900

CGGTTAGAGA CGGACAGGTG AGACGGGTCT CGGTTAAGCA AGCAAATTGG AATGCAAGGT 960

TGAAGGTGTA ATCTGCATAG CGTGGAAATG AGAGGGCTCT GTGGGCAGCC AGGAAGGTGA 1020

GACGAAATGA GGAAAGAGGC ACCAGAAGCT GTTGTTCTGA AGTGCCCGTG GTCATAGCTC 1080

CAGGATTAAG TACGGATGTC CCATGCCAAG CTGCTGGCTT CGAAAGCGAG TACGGAGTAG 1140

TGTCCATTGT TCACGAGGGA TCCCCAATGT GTTAGACATG CCTGAATCAA TTTTGTCCTA 1200






EP 0 684 313 A2 






TTTTTGGATT TCAACTGTTT CTCTCGACTG TGCTCGGTAG CGACTATGCC GCAAGGTACA 1260

CTACATGTTG TACAATAATC ATACATCGAC CTTCCGTAGG AGTGCTGAAA TACCCGACCT 1320 

GCTCTCTCTA GCAGGTGCCT AATGGCTTTC GTGTAACTCG ATCGAAACGG ATCAGCAAGT 1380 

CCATTTGCTG TTGGTTGAGA TGTACGATTT ACAAACACGT GGAGAGGTGA GCCACAGCGA 1440 

TAGGCTTCTG GAAGGATTCT GGCGTCTCGG AAAGAGGGCC ACTCGCCCCA CTAACCGGCG 1500 

CCGATCTTGA CATGGGGCTC GCAGGGGGTT TAAGTGCACA CTACGGAGTA CGGATTACAC 1560 

AGTAGTGTAT GGGTGGGGGC GAGTTTGGGT GGCCTTGTGT GGGGCTCACC GGCTGCCTGT 1620 

TCTCGGGGAG TCTTGGCGGG CCGATTGGAC CCACCTAACC ACGGGTAGTC TTGGCCCGGC 1680 

CAACTCACAC CGCCCTCATG TTTCGGAGCC AGTCAGGGAG GCAGGCACTA CTCAGTCAGG 1740

TACACACGTC GGGCTCCTCG ATGCTGGGTG ACATCGAGGC GATACTGCAT TCCAACTACG 1800 

GTTGGCATAG GAGGTATCCT ATTCTAGAGC TGTTCTACGC CGGAACGTAA CCCGGGATAA 1860

CCCGGGATAT CGCTTCCCTG AGCGAGCGCG CTGCTGAGGA TCATACAACC CAACAACCGA 1920 

CGACGGTGCA AGAAGGTTGG GGGAAGGAAG AAATCAAGGA AAAAAAAATA GGGGGGGTGG 1980 

GGACCAAGAG AGAAAGAAAG GAGAAAAGGG TGGGGGGAGG GAAGAGAAAA AAAAAACGGA 2040

GGAATATGGC GTCGCTCTTC GACTGGTTCC GGAAGGGGGC ATCTGGGTAC ACATATGCAC 2100 

CTCTTCCGCA CGGCAGGGAT ATAAACCGGG AGTGCAGTCC CACCGATCAT GCTGAGTCCG 2160 

CCCGTCTCCA GACTTCACGG TCGCAGAGGA CTAGACGCGC GGTGAAG ATG ACT GGC 2216
Met Thr Gly
1

CTC GGA GTG ATG GTG GTG ATG GTC GGC TTC CTG GCG ATC GCC TCT CT 2263
Leu Gly Val Met Val Val Met Val Gly Phe Leu Ala Ile Ala Ser Leu
5 10 15



GTAAGCAGCG ATTCCAGGGG TCCGGTGTGC GTTAAAAGAA AAAGCTAACG CCACCAG A 2321

CAA TCC GAG TCC CGG CCA TGC GAC ACC CCA GAC TTG GGC TTC CAG TGT 2369
Gln Ser Glu Ser Arg Pro Cys Asp Thr Pro Asp Leu Gly Phe Gln Cys
20 25 30 35

GGT ACG GCC ATT TCC CAC TTC TGG GGC CAG TAG TCG CCC TAG TTC TCC 2417
Gly Thr Ala Ile Ser His Phe Trp Gly Gln Tyr Ser Pro Tyr Phe Ser
40 45 50

GTG CCC TCG GAG CTG GAT GCT TCG ATC CCC GAC GAC TGC GAG GTG ACG 2465
Val Pro Ser Glu Leu Asp Ala Ser Ile Pro Asp Asp Cys Glu Val Thr
55 60 65

TTT GCC CAA GTC CTC TCC CGC CAC GGC GCG AGG GCG CCG ACG CTC AAA 2513
Phe Ala Gln Val Leu Ser Arg His Gly Ala Arg Ala Pro Thr Leu Lys
70 75 80

CGG GCC GCG AGC TAG GTC GAT CTC ATC GAC AGG ATC CAC CAT GGC GCC 2561
Arg Ala Ala Ser Tyr Val Asp Leu Ile Asp Arg Ile His His Gly Ala
85 90 95

ATC TCC TAG GGG CCG GGC TAG GAG TTC CTC AGG ACG TAT GAG TAG ACG 2609
Ile Ser Tyr Gly Pro Gly Tyr Glu Phe Leu Arg Thr Tyr Asp Tyr Thr
100 105 110 115

GTG GGC GCG GAC GAG CTG ACG CGG ACG GGC CAG CAG CAG ATG GTC AAC 2657
Leu Gly Ala Asp Glu Leu Thr Arg Thr Gly Gln Gln Gln Met Val Asn
120 125 130

TCG GGC ATC AAG TTT TAG GGC CGG TAG GGC GCT CTC GCG GGC AAG TCG 2705
Ser Gly Ile Lys Phe Tyr Arg Arg Tyr Arg Ala Leu Ala Arg Lys Ser
135 140 145

ATG CGG TTC GTG GGC ACC GCG GGC CAG GAC GGC GTC GTG GAC TCG GCG 2753
Ile Pro Phe Val Arg Thr Ala Gly Gln Asp Arg Val Val His Ser Ala
150 155 160

GAG AAC TTC ACC CAG GGC TTC GAC TCT GCG CTG GTC GCG GAC CGC GGG 2801
Glu Asn Phe Thr Gln Gly Phe His Ser Ala Leu Leu Ala Asp Arg Gly
165 170 175

TCG AGG GTC CGG CGC ACC CTC GGC TAT GAC ATG GTG GTG ATC CCG GAA 2849
Ser Thr Val Arg Pro Thr Leu Pro Tyr Asp Met Val Val Ile Pro Glu
180 185 190 195

ACG GCG GGC GCC AAC AAC ACG CTC GAC AAC GAG CTC TGC ACC GCG TTC 2897
Thr Ala Gly Ala Asn Asn Thr Leu His Asn Asp Leu Cys Thr Ala Phe
200 205 210

GAG GAA GGC CCG TAG TCG ACC ATC GGC GAC GAC GCC GAA GAC ACC TAC 2945
Glu Glu Gly Pro Tyr Ser Thr Ile Gly Asp Asp Ala Gln Asp Thr Tyr
215 220 225

GTG TCG ACC TTC GCC GGA GCC ATC ACC GCC CGG GTG AAC GCC AAC CTG 2993
Leu Ser Thr Phe Ala Gly Pro Ile Thr Ala Arg Val Asn Ala Asn Leu
230 235 240

CCG GGC GCC AAC CTG ACG GAG GCC GAC ACG GTG GCG CTG ATG GAC CTC 3041
Pro Gly Ala Asn Leu Thr Asp Ala Asp Thr Val Ala Leu Met Asp Leu
245 250 255

TGC CCG TTC GAG ACG GTC GCC TCC TCC TCC TCG GAC GCG GCA ACG GCG 3089
Cys Pro Phe Glu Thr Val Ala Ser Ser Ser Ser Asp Pro Ala Thr Ala
260 265 270 275

GAC GCG GGG GGC GGC AAC GGG CGG CGG CTG TCG CGC TTC TGC CGC CTG 3137
Asp Ala Gly Gly Gly Asn Gly Arg Pro Leu Ser Pro Phe Cys Arg Leu
280 285 290

TTC AGC GAG TCC GAG TGG CGC GCG TAC GAC TAC CTG CAG TCG GTG GGC 3185
Phe Ser Glu Ser Glu Trp Arg Ala Tyr Asp Tyr Leu Gln Ser Val Gly
295 300 305

AAG TGG TAC GGG TAC GGG CCG GGC AAC CCG CTG GGG CCG ACG CAG GGG 3233
Lys Trp Tyr Gly Tyr Gly Pro Gly Asn Pro Leu Gly Pro Thr Gln Gly
310 315 320



GTC GGG TTC GTC AAC GAG CTG CTG GCG CGG CTG GCC GGG GTC CCC GTG 3281
Val Gly Phe Val Asn Glu Leu Leu Ala Arg Leu Ala Gly Val Pro Val
325 330 335

CGC GAC GGC ACC AGC ACC AAC CGC ACC CTC GAC GGC GAC CCG CGC ACC 3329
Arg Asp Gly Thr Ser Thr Asn Arg Thr Leu Asp Gly Asp Pro Arg Thr
340 345 350 355

TTC CCG CTC GGC CGG CCC CTC TAC GCC GAC TTC AGC CAC GAC AAC GAC 3377
Phe Pro Leu Gly Arg Pro Leu Tyr Ala Asp Phe Ser His Asp Asn Asp
360 365 370

ATG ATG GGC GTC CTC GGC GCC CTC GGC GCC TAC GAC GGC GTC CCG CCC 3425
Met Met Gly Val Leu Gly Ala Leu Gly Ala Tyr Asp Gly Val Pro Pro
375 380 385

CTC GAC AAG ACC GCC CGC CGC GAC CCG GAA GAG CTC GGC GGG TAC GCG 3473
Leu Asp Lys Thr Ala Arg Arg Asp Pro Glu Glu Leu Gly Gly Tyr Ala
390 395 400

c I ^AC GTC GAG AAG ATG 3521
Ala Ser Trp Ala Val Pro Phe Ala Ala Arg Ile Tyr Val Glu Lys Met
405 410 415

CGG TGC AGC GGC GGC GGC GGC GGC GGC GGC GGC GGC GAG GGG CGG CAG 3569
Arg Cys Ser Gly Gly Gly Gly Gly Gly Gly Gly Gly Glu Gly Arg Gln
420 425 430 435

GAG AAG GAT GAG GAG ATG GTC AGG GTG CTG GTG AAC GAC CGG GTG ATG 3617
Glu Lys Asp Glu Glu Met Val Arg Val Leu Val Asn Asp Arg Val Met
440 445 450

^0 11^ ^ GGG ATG TGT ACG CTA GAA 3665
Thr Leu Lys Gly Cys Gly Ala Asp Glu Arg Gly Met Cys Thr Leu Glu
455 460 465

CGG TTC ATC GAA AGC ATG GCG TTT GCG AGG GGG AAC GGC AAG TGG GAT 3713
Arg Phe Ile Glu Ser Met Ala Phe Ala Arg Gly Asn Gly Lys Trp Asp
470 475 480

ACGCCCGAGA TTGAACAGAA CTTGTGATGG 3765
Leu Cys Phe Ala
485

GGGTAGAGTG TGGTATTCGA GATGATAGTT CACAGTTTTC GGGAATCAAA AATCGGTTAG 3825

ACTGGCGAAA TTCAAGTCTG GGGCCTGCGG CGTCTGCATT CTCCGTTCCC TGTTGTTACC 3885

TTCTTAATGG TTTTTTTTTA TTTTTTATTT TTCTTAAATT TTCACACAAA CCTTTTATTG 3945

TCTTTTTTTC TTCTTTTTCT TCTTCTGCAC ATCGGATGGG AATTGTCGAC 3995


(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 487 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Met Thr Gly Leu Gly Val Met Val Val Met Val Gly Phe Leu Ala Ile
1 5 10 15

Ala Ser Leu Gln Ser Glu Ser Arg Pro Cys Asp Thr Pro Asp Leu Gly
20 25 30

Phe Gln Cys Gly Thr Ala Ile Ser His Phe Trp Gly Gln Tyr Ser Pro
35 40 45

Tyr Phe Ser Val Pro Ser Glu Leu Asp Ala Ser Ile Pro Asp Asp Cys
50 55 60

Glu Val Thr Phe Ala Gln Val Leu Ser Arg His Gly Ala Arg Ala Pro
65 70 75 80

Thr Leu Lys Arg Ala Ala Ser Tyr Val Asp Leu Ile Asp Arg Ile His
85 90 95

His Gly Ala Ile Ser Tyr Gly Pro Gly Tyr Glu Phe Leu Arg Thr Tyr
100 105 110

Asp Tyr Thr Leu Gly Ala Asp Glu Leu Thr Arg Thr Gly Gln Gln Gln
115 120 125

Met Val Asn Ser Gly Ile Lys Phe Tyr Arg Arg Tyr Arg Ala Leu Ala
130 135 140

Arg Lys Ser Ile Pro Phe Val Arg Thr Ala Gly Gln Asp Arg Val Val
145 150 155 160

His Ser Ala Glu Asn Phe Thr Gln Gly Phe His Ser Ala Leu Leu Ala
165 170 175

Asp Arg Gly Ser Thr Val Arg Pro Thr Leu Pro Tyr Asp Met Val Val
180 185 190

Ile Pro Glu Thr Ala Gly Ala Asn Asn Thr Leu His Asn Asp Leu Cys
195 200 205

Thr Ala Phe Glu Glu Gly Pro Tyr Ser Thr Ile Gly Asp Asp Ala Gln
210 215 220

Asp Thr Tyr Leu Ser Thr Phe Ala Gly Pro Ile Thr Ala Arg Val Asn
225 230 235 240

Ala Asn Leu Pro Gly Ala Asn Leu Thr Asp Ala Asp Thr Val Ala Leu
245 250 255

Met Asp Leu Cys Pro Phe Glu Thr Val Ala Ser Ser Ser Ser Asp Pro
260 265 270

Ala Thr Ala Asp Ala Gly Gly Gly Asn Gly Arg Pro Leu Ser Pro Phe
275 280 285

Cys Arg Leu Phe Ser Glu Ser Glu Trp Arg Ala Tyr Asp Tyr Leu Gln
290 295 300

Ser Val Gly Lys Trp Tyr Gly Tyr Gly Pro Gly Asn Pro Leu Gly Pro
305 310 315 320

Thr Gln Gly Val Gly Phe Val Asn Glu Leu Leu Ala Arg Leu Ala Gly
325 330 335

Val Pro Val Arg Asp Gly Thr Ser Thr Asn Arg Thr Leu Asp Gly Asp
340 345 350

Pro Arg Thr Phe Pro Leu Gly Arg Pro Leu Tyr Ala Asp Phe Ser His
355 360 365

Asp Asn Asp Met Met Gly Val Leu Gly Ala Leu Gly Ala Tyr Asp Gly
370 375 380

Val Pro Pro Leu Asp Lys Thr Ala Arg Arg Asp Pro Glu Glu Leu Gly
385 390 395 400

Gly Tyr Ala Ala Ser Trp Ala Val Pro Phe Ala Ala Arg Ile Tyr Val
405 410 415

Glu Lys Met Arg Cys Ser Gly Gly Gly Gly Gly Gly Gly Gly Gly Glu
420 425 430

Gly Arg Gln Glu Lys Asp Glu Glu Met Val Arg Val Leu Val Asn Asp
435 440 445

Arg Val Met Thr Leu Lys Gly Cys Gly Ala Asp Glu Arg Gly Met Cys
450 455 460

Thr Leu Glu Arg Phe Ile Glu Ser Met Ala Phe Ala Arg Gly Asn Gly
465 470 475 480

Lys Trp Asp Leu Cys Phe Ala
485

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 100 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 2..100



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

G ACC TTG GCT CGC AAC CAC ACA GAC ACG CTG TCT CCG TTC TGC GCT 46
Thr Leu Ala Arg Asn His Thr Asp Thr Leu Ser Pro Phe Cys Ala
1 5 10 15

CTT TCC ACG CAA GAG GAG TGG CAA GCA TAT GAC TAG TAG CAA AGT CTG 94
Leu Ser Thr Gln Glu Glu Trp Gln Ala Tyr Asp Tyr Tyr Gln Ser Leu
20 25 30

GGG AAT 100
Gly Asn


(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Thr Leu Ala Arg Asn His Thr Asp Thr Leu Ser Pro Phe Cys Ala Leu
1 5 10 15

Ser Thr Gln Glu Glu Trp Gln Ala Tyr Asp Tyr Tyr Gln Ser Leu Gly
20 25 30

Asn



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 106 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 2..106



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

T ACG GTA GCG CGC ACC AGC GAC GCA AGT CAG CTG TCA CCG TTC TGT 46
Thr Val Ala Arg Thr Ser Asp Ala Ser Gln Leu Ser Pro Phe Cys
1 5 10 15

CAA CTC TTC ACT CAC AAT GAG TGG AAG AAG TAC AAC TAC CTT CAG TCC 94
Gln Leu Phe Thr His Asn Glu Trp Lys Lys Tyr Asn Tyr Leu Gln Ser
20 25 30

TTG GGC AAG TAC 106
Leu Gly Lys Tyr
35


(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

Thr Val Ala Arg Thr Ser Asp Ala Ser Gln Leu Ser Pro Phe Cys Gln
1 5 10 15

Leu Phe Thr His Asn Glu Trp Lys Lys Tyr Asn Tyr Leu Gln Ser Leu
20 25 30

Gly Lys Tyr 
35 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 109 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 2..109



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:



C ACC ATG GCG CGC ACC GCC ACT CGG AAC CGT AGT CTG TCT CCA TTT 46
Thr Met Ala Arg Thr Ala Thr Arg Asn Arg Ser Leu Ser Pro Phe
1 5 10 15

TGT GCC ATC TTC ACT GAA AAG GAG TGG CTG CAG TAC GAC TAC CTT CAA 94
Cys Ala Ile Phe Thr Glu Lys Glu Trp Leu Gln Tyr Asp Tyr Leu Gln
20 25 30

TCT CTA TCA AAG TAC 109
Ser Leu Ser Lys Tyr
35

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Thr Met Ala Arg Thr Ala Thr Arg Asn Arg Ser Leu Ser Pro Phe Cys
1 5 10 15

Ala Ile Phe Thr Glu Lys Glu Trp Leu Gln Tyr Asp Tyr Leu Gln Ser
20 25 30

Leu Ser Lys Tyr
35

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1912 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..1396 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..1398 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

ATG GGC GTC TCT GCT GTT CTA CTT CCT TTG TAT CTC CTA GCT GGA GTC 48
Met Gly Val Ser Ala Val Leu Leu Pro Leu Tyr Leu Leu Ala Gly Val
1 5 10 15

ACC TCC GGA CTG GCA GTC CCC GCC TCG AGA AAT CAA TCC ACT TGC GAT 96
Thr Ser Gly Leu Ala Val Pro Ala Ser Arg Asn Gln Ser Thr Cys Asp
20 25 30

ACG GTC GAT CAA GGG TAT CAA TGC TTC TCC GAG ACT TCG CAT CTT TGG 144
Thr Val Asp Gln Gly Tyr Gln Cys Phe Ser Glu Thr Ser His Leu Trp
35 40 45

GGT CAA TAC GCG CCG TTC TTC TCT CTG GCA AAC GAA TCG GTC ATC TCC 192
Gly Gln Tyr Ala Pro Phe Phe Ser Leu Ala Asn Glu Ser Val Ile Ser
50 55 60

CCT GAT GTG CCC GCC GGT TGC AGA GTC ACT TTC GCT CAG GTC CTC TCC 240
Pro Asp Val Pro Ala Gly Cys Arg Val Thr Phe Ala Gln Val Leu Ser
65 70 75 80

CGT CAT GGA GCG CGG TAT CCG ACC GAG TCC AAG GGC AAG AAA TAC TCC 288
Arg His Gly Ala Arg Tyr Pro Thr Glu Ser Lys Gly Lys Lys Tyr Ser
85 90 95

GCT CTC ATT GAG GAG ATC CAG CAG AAC GTG ACC ACC TTT GAT GGA AAA 336
Ala Leu Ile Glu Glu Ile Gln Gln Asn Val Thr Thr Phe Asp Gly Lys
100 105 110

TAT GCC TTC CTG AAG ACA TAG AAC TAG AGC TTG GGT GGA GAT GAG CTG 384
Tyr Ala Phe Leu Lys Thr Tyr Asn Tyr Ser Leu Gly Ala Asp Asp Leu
115 120 125

ACT CCC TTC GGA GAg' CAG GAG CTA GTC AAC TCC GGC ATC AAG TTC TAG 432
Thr Pro Phe Gly Glu Gln Glu Leu Val Asn Ser Gly Ile Lys Phe Tyr
130 135 140

ff*^ ""^^ TTC GTC GGC GCC 480
Gln Arg Tyr Asn Ala Leu Thr Arg His Ile Asn Pro Phe Val Arg Ala
145 150 155 160

Th^ *r ff* '^^^ '■CC GCC GAG AAG TTC GTC GAG GGC 528
Thr Asp Ala Ser Arg Val His Glu Ser Ala Glu Lys Phe Val Glu Gly
165 170 175

pJe G?J iSr SI f * f**" "^^^ ^^'^ CCC CAC CAG CCT 576
Phe Gln Thr Ala Arg Gln Asp Asp His His Ala Asn Pro His Gln Pro
180 185 190

TCG CCT CGC GTG GAC GTG GCC ATC CCC GAA GGC AGC GCC TAC AAC AAC 624
Ser Pro Arg Val Asp Val Ala Ile Pro Glu Gly Ser Ala Tyr Asn Asn
195 200 205

T^^ ^°C ACC GCC TTC GAA TCC AGC -ACC GTC GGC 672
Thr Leu Glu His Ser Leu Cys Thr Ala Phe Glu Ser Ser Thr Val Gly
210 215 220

GAC GAC GCG GTC GCC AAC TTC ACC GCC GTG TTC GCG CCG GCG ATC GCC 720
Asp Asp Ala Val Ala Asn Phe Thr Ala Val Phe Ala Pro Ala Ile Ala
225 230 235 240

G?S Su f?f f^! ^. =J? S*^ CTG TCC ACC GAC GAC 768
Gln Arg Leu Glu Ala Asp Leu Pro Gly Val Gln Leu Ser Thr Asp Asp
245 250 255

vll v!? f"""^ """^ "^"^ ""^^ CCG TTC GAG ACG GTC AGC CTG ACC 816
Val Val Asn Leu Met Ala Met Cys Pro Phe Glu Thr Val Ser Leu Thr
260 265 270

GAC GAC GCG CAC ACG CTG TCG CCG TTC TGC GAC CTC TTC ACG GCC ACT 864
Asp Asp Ala His Thr Leu Ser Pro Phe Cys Asp Leu Phe Thr Ala Thr
275 280 285

gJu ?Sr I'''' ^"^^ GGC 912
Glu Trp Thr Gln Tyr Asn Tyr Leu Leu Ser Leu Asp Lys Tyr Tyr Gly
290 295 300

?Jr ^ ^ ^ "^^^ C'^C ^ <^TC GGC TGG GCG 960
Tyr Gly Gly Gly Asn Pro Leu Gly Pro Val Gln Gly Val Gly Trp Ala
305 310 315 320

AAC GAG CTG ATG GCG CGG CTA ACG CGC GCC CCC GTG CAC GAC CAC ACC 1008
Asn Glu Leu Met Ala Arg Leu Thr Arg Ala Pro Val His Asp His Thr
325 330 335

r w ? AGT CCG GCC ACC TTC CGG CTG AAC 1056
Cys Val Asn Asn Thr Leu Asp Ala Ser Pro Ala Thr Phe Pro Leu Asn
340 345 350

GCC ACC CTC TAG GCC GAG TTG TGC CAC GAG AGC AAC CTG GTG TCG ATC 1104
Ala Thr Leu Tyr Ala Asp Phe Ser His Asp Ser Asn Leu Val Ser Ile
355 360 365

TTG rrCG GGG GTG GGG CTG TAG AAG GGC ACG GCG CCG CTG TCG CAG ACC 1152
Phe Trp Ala Leu Gly Leu Tyr Asn Gly Thr Ala Pro Leu Ser Gln Thr
370 375 380

TGC GTC GAG AGC GTG TCG CAG ACG GAC GGG TAG GCC GCC GGC TGG ACG 1200
Ser Val Glu Ser Val Ser Gln Thr Asp Gly Tyr Ala Ala Ala Trp Thr
385 390 395 400

CTG CCG TTG GGC GCT GGC GGG TAG GTC GAG ATG ATG GAG TGT GGC GCC 1248
Val Pro Phe Ala Ala Arg Ala Tyr Val Glu Met Met Gln Cys Arg Ala
405 410 415

GAG AAG GAG CCG GTG GTG CGC GTG GTG GTC AAC GAC CGG GTC ATG CCG 1296
Glu Lys Glu Pro Leu Val Arg Val Leu Val Asn Asp Arg Val Met Pro
420 425 430

CTG CAT GGC TGG GCT ACG GAC AAG CTG GGG CGG TGG AAG CGG GAC GCT 1344
Leu His Gly Cys Pro Thr Asp Lys Leu Gly Arg Cys Lys Arg Asp Ala
435 440 445

TTG GTG GGG GGG CTG AGC TTT GCG CAG ^^G GGG GGG AAC TGG GCG GAT 1392
Phe Val Ala Gly Leu Ser Phe Ala Gln Ala Gly Gly Asn Trp Ala Asp
450 455 460

TGT TTG TGATGTTGAG AAGAAAGGTA GATAGATAGG TAGTACATAT GGATTGCTGG 1448
Cys Phe
465

GCTCTGGGTC GTTGCCCACA ATGCATATTA CGCCCGTCAA CTGCCTTGCG CCATCCACCT 1508

CTCACCCTGG ACGCAACCGA GCGGTCTACC CTGCACACGG CTTCCACCGC GACGCGCACG 1568

GATAAGGCGC TTTTGTTACG GGGTTGGGGC TGGGGGCAGC CGGAGCCGGA GAGAGAGACC 1628

AGCGTGAAAA ACGACAGAAC ATAGATATCA ATTCGACGCC AATTCATGCA GAGTAGTATA 1688

CAGACGAACT GAAACAAACA CATCACTTCC CTCGCTCCTC TCCTGTAGAA GACGCTCCCA 1748

CCAGCCGCTT CTGGCCCTTA TTCCCGTACG CTAGGTAGAC CAGTCAGCCA GACGCATGCC 1808

TCACAAGAAC GGGGGCGGGG GACACACTCC GCTCGTACAG CACCCACGAC GTGTACAGGA 1868

AAACCGGCAG CGCCACAATC GTCGAGAGCC ATCTGCAGGA ATTC 1912

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 466 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:



Met Gly Val Ser Ala Val Leu Leu Pro Leu Tyr Leu Leu Ala Gly Val
1 5 10 15

Thr Ser Gly Leu Ala Val Pro Ala Ser Arg Asn Gln Ser Thr Cys Asp
20 25 30

Thr Val Asp Gln Gly Tyr Gln Cys Phe Ser Glu Thr Ser His Leu Trp
35 40 45

Gly Gln Tyr Ala Pro Phe Phe Ser Leu Ala Asn Glu Ser Val Ile Ser
50 55 60

Pro Asp Val Pro Ala Gly Cys Arg Val Thr Phe Ala Gln Val Leu Ser
65 70 75 80

Arg His Gly Ala Arg Tyr Pro Thr Glu Ser Lys Gly Lys Lys Tyr Ser
85 90 95

Ala Leu Ile Glu Glu Ile Gln Gln Asn Val Thr Thr Phe Asp Gly Lys
100 105 110

Tyr Ala Phe Leu Lys Thr Tyr Asn Tyr Ser Leu Gly Ala Asp Asp Leu
115 120 125

Thr Pro Phe Gly Glu Gln Glu Leu Val Asn Ser Gly Ile Lys Phe Tyr
130 135 140

Gln Arg Tyr Asn Ala Leu Thr Arg His Ile Asn Pro Phe Val Arg Ala
145 150 155 160

Thr Asp Ala Ser Arg Val His Glu Ser Ala Glu Lys Phe Val Glu Gly
165 170 175

Phe Gln Thr Ala Arg Gln Asp Asp His His Ala Asn Pro His Gln Pro
180 185 190

Ser Pro Arg Val Asp Val Ala Ile Pro Glu Gly Ser Ala Tyr Asn Asn
195 200 205

Thr Leu Glu His Ser Leu Cys Thr Ala Phe Glu Ser Ser Thr Val Gly
210 215 220

Asp Asp Ala Val Ala Asn Phe Thr Ala Val Phe Ala Pro Ala Ile Ala
225 230 235 240

Gln Arg Leu Glu Ala Asp Leu Pro Gly Val Gln Leu Ser Thr Asp Asp
245 250 255

Val Val Asn Leu Met Ala Met Cys Pro Phe Glu Thr Val Ser Leu Thr
260 265 270

Asp Asp Ala His Thr Leu Ser Pro Phe Cys Asp Leu Phe Thr Ala Thr
275 280 285

Glu Trp Thr Gln Tyr Asn Tyr Leu Leu Ser Leu Asp Lys Tyr Tyr Gly
290 295 300

Tyr Gly Gly Gly Asn Pro Leu Gly Pro Val Gln Gly Val Gly Trp Ala
305 310 315 320

Asn Glu Leu Met Ala Arg Leu Thr Arg Ala Pro Val His Asp His Thr
325 330 335

Cys Val Asn Asn Thr Leu Asp Ala Ser Pro Ala Thr Phe Pro Leu Asn
340 345 350

Ala Thr Leu Tyr Ala Asp Phe Ser His Asp Ser Asn Leu Val Ser Ile
355 360 365

Phe Trp Ala Leu Gly Leu Tyr Asn Gly Thr Ala Pro Leu Ser Gln Thr
370 375 380

Ser Val Glu Ser Val Ser Gln Thr Asp Gly Tyr Ala Ala Ala Trp Thr
385 390 395 400

Val Pro Phe Ala Ala Arg Ala Tyr Val Glu Met Met Gln Cys Arg Ala
405 410 415

Glu Lys Glu Pro Leu Val Arg Val Leu Val Asn Asp Arg Val Met Pro
420 425 430

Leu His Gly Cys Pro Thr Asp Lys Leu Gly Arg Cys Lys Arg Asp Ala
435 440 445

Phe Val Ala Gly Leu Ser Phe Ala Gln Ala Gly Gly Asn Trp Ala Asp
450 455 460

Cys Phe
465



(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 112 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
GACGGTCAGC CTGACCGACG ACGCGCACAC GCTGTCGCCG TTCTGCGACC TCTTCACCGC 
CGCCGAGTGG ACGCAGTACA ACTACCTGCT CTCGCTGGAC AAGTACTACG TC 
(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 90 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
CAGTAACCTG GTGTCGATCT TCTGGNCGCTG GGTCTGTACA ACGGCACCAA GCCCCTGTCG 
CAGACCACCG TGGAGGATAT CACCCGGACG 
(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
ATGGAYATGT GYTCNTTYGA 
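
SEQ ID NO: 15 uses nucleotide ambiguity codes, so it denotes a pool of primers rather than a single oligonucleotide. The sketch below is not part of the original listing; it enumerates that pool in Python under the standard IUPAC convention (R = A or G, Y = C or T, N = any base). The function name expand_primer is hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide codes needed for the primers in this listing.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "N": "ACGT",  # any base
}

def expand_primer(primer: str) -> list[str]:
    """Return every concrete sequence matched by a degenerate primer."""
    bases = primer.replace(" ", "").upper()
    return ["".join(combo) for combo in product(*(IUPAC[b] for b in bases))]

variants = expand_primer("ATGGAYATGT GYTCNTTYGA")
print(len(variants))  # number of distinct oligonucleotides in the pool
print(variants[0])    # first member of the pool
```

For this primer the three Y positions and the single N position give 2 x 2 x 2 x 4 = 32 distinct 20-mers.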
(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
TTRCCRGCRC CRTGNCCRTA 
(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
TAYGCNGAYT TYTCNCAYGA 
(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
CGRTCRTTNA CNAGNACNC 
(2) INFORMATION FOR SEQ ID NO: 19: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

ATGGAYATGT GYTCNTTYGA 

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 20 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

TTRCCRGCRC CRTGNCCRTA 
(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 
AGTCCGGAGG TGACTCCAGC TAGGAGATAC 



Claims

1. A DNA sequence coding for a polypeptide having phytase activity and which DNA sequence is derived
from a fungus selected from the group consisting of Acrophialophora levis, Aspergillus terreus,
Talaromyces thermophilus, Aspergillus fumigatus, Aspergillus nidulans and

sa,«:d^r:^::lr" = - o« sad„en„ . 

«. « O.. s.,„e„ea .... s. a ,...a„. o?,:Trs:Z^::r.d ,„ ,a, „ 

' ^."crirr ,rrr' - ^"^^'^ ^--^ P^v..s, ac„vl,v a„d s,«nce IS 

(a) me DNA sequence of Figure 2 IS6Q ID NfliT „, . 

(b) a DNA sequence which hybridi'.s under s^frn """"'""""'V 

<c, a DNA sequence which, becau!: ouh^^eg "'""^ 

^=i^s«-.Ss~^i^^^^ 

. A ONA ' ' " - or ,c, 

=e'ec"t,"r;?,*17'= ° """"""^ '""''V and which DNA sequence is 

sequences of (a) or (b) but which co,es 7oflTZ^!l\^'''''^^ 

ID NO.M] ,solat3b.e from Aspergillus terreus (cis S^os?' °' '^^^V ^^^^ '° '^^'^^ and/or SEQ 
variant or aequivalent thereof. * ^ °' '^^'Ch DNA sequence is a degenerate 

• A DNA sequence as claimed in any one of claims 4 to r k u 

act.v.ty Which DNA sequence is derived from a rngul ' '^--3 P^ytase 

A DNA sequence according to claim 3 wherein the fungus is selected from a group as defined in claim

10. A DNA sequence which codes for a polypeptide having phytase activity and which DNA sequence
hybridizes under standard conditions with a probe which is a product of a PCR reaction with DNA
isolated from a fungus as defined in any one of claims 1 to 3 and the following pair of PCR primers:
"ATGGA(C/T)ATGTG(C/T)TC(N)TT(C/T)GA" [SEQ ID NO: 15] as sense primer and
"TT(A/G)CC(A/G)GC(A/G)CC(G/A)TG(N)CC(A/G)TA" [SEQ ID NO: 16] as anti-sense primer.

11. A DNA sequence which codes for a polypeptide having phytase activity and which DNA sequence
hybridizes under standard conditions with a probe which is a product of a PCR reaction with DNA
isolated from Aspergillus terreus (CBS 220.95) and the following two pairs of PCR primers:
(a) "ATGGA(C/T)ATGTG(C/T)TC(N)TT(C/T)GA" [SEQ ID NO: 15] as the sense primer and
"TT(A/G)CC(A/G)GC(A/G)CC(G/A)TG(N)CC(A/G)TA" [SEQ ID NO: 16] as the anti-sense primer; and
(b) "TA(C/T)GC(N)GA(C/T)TT(C/T)TC(N)CA(C/T)GA" [SEQ ID NO: 17] as the sense primer and
"CG(G/A)TC(G/A)TT(N)AC(N)AG(N)AC(N)C" [SEQ ID NO: 18] as the anti-sense primer.
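The primers in the claims are written with bracketed alternatives, whereas the sequence listing uses one-letter IUPAC ambiguity codes (Y = C or T, R = A or G, N = any base). The sketch below shows one way to convert between the two notations and to count how many distinct oligonucleotides a degenerate primer represents. The function names `expand_notation` and `degeneracy` are hypothetical and are not taken from the application.

```python
# Illustrative sketch: IUPAC ambiguity codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def expand_notation(primer: str) -> str:
    """Rewrite IUPAC codes in the bracketed style used in the claims."""
    parts = []
    for base in primer.replace(" ", "").upper():
        options = IUPAC[base]
        if base == "N":
            parts.append("(N)")            # the claims keep N as (N)
        elif len(options) > 1:
            parts.append("(" + "/".join(options) + ")")
        else:
            parts.append(base)
    return "".join(parts)

def degeneracy(primer: str) -> int:
    """Number of distinct sequences covered by a degenerate primer."""
    count = 1
    for base in primer.replace(" ", "").upper():
        count *= len(IUPAC[base])
    return count

SEQ_ID_15 = "ATGGAYATGT GYTCNTTYGA"  # sense primer, as printed in the listing
SEQ_ID_16 = "TTRCCRGCRC CRTGNCCRTA"  # anti-sense primer, as printed in the listing

print(expand_notation(SEQ_ID_15))  # ATGGA(C/T)ATGTG(C/T)TC(N)TT(C/T)GA
print(degeneracy(SEQ_ID_15))       # 32
print(degeneracy(SEQ_ID_16))       # 128
```

The alternatives inside each bracket are emitted in alphabetical order, so a position printed as (G/A) in the claims appears here as (A/G); the set of bases is the same.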

12. A DNA sequence coding for a chimeric construct having phytase activity which chimeric construct
comprises a fragment of a DNA sequence as claimed in any one of claims 1 to 11.

13. A DNA sequence coding for a chimeric construct as defined in claim 12 which chimeric construct
consists at its N-terminal end of a fragment of the Aspergillus niger phytase fused at its C-terminal end
to a fragment of the Aspergillus terreus phytase.

14. A DNA sequence as claimed in claim 13 with the specific nucleotide sequence as shown in Figure 7
[SEQ ID NO: 11] and a degenerate variant or equivalent thereof.

15. A DNA sequence as claimed in any one of claims 1 to 14 wherein the encoded polypeptide is a 
phytase. 



16. A polypeptide encoded by a DNA sequence as claimed in any one of claims 1 to 15.

17. A vector comprising a DNA sequence as claimed in any one of claims 1 to 15.

18. A vector as claimed in claim 17 suitable for the expression of said DNA sequence in bacteria or a
fungal or a yeast host.

19. Bacteria or a fungal or yeast host transformed by a DNA sequence as claimed in any one of claims 1
to 15 or a vector as claimed in claim 17 or 18.

20. A composite food or feed comprising one or more polypeptides as defined in claim 16.



21. A process for the preparation of a polypeptide as claimed in claim 16 characterized in that transformed
bacteria or host cell as claimed in claim 19 is cultured under suitable culture conditions and the
polypeptide is recovered therefrom.

22. A polypeptide when produced by a process as claimed in claim 21.

23. A process for the preparation of a composite feed or food wherein the components of the composition
are mixed with one or more polypeptides as defined in claim 16.

24. A process for the reduction of levels of phytate in animal manure characterized in that an animal is fed
a composite feed as defined in claim 20 in an amount effective in converting phytate contained in the
feedstuff to inositol and inorganic phosphate.

25. Use of a polypeptide according to claim 16 for the conversion of phytate to inositol phosphates, inositol
and inorganic phosphate.

Fig. 1/1

tctagaacaataacaggtactccctaggtacccgaaggaccttgtggaaaatgtatggag 60 

gtggacacggcaccaaccaccacccgcgatggcgcacgtggtgccctaaccccttgctcc 120 

ctcaggatggaatccatgtcgactctttaccctcaccatcgcctggatgaaacctccccg 180 

ctaagctcacgacgatcgctatttccgaccgatttgaccgtcatggtggagggctgattc 240

ggtcgatgctcctgccttcatttcggagttcggagacatgaaaggcttatatgaggacgt 300

cccaggtcggggacgaaatccgccctgggctgxgctccttcgtcggaaacatctgctgtc 360

cgtgatggctaccatgggctttcttgccattgtgctctccgtcgccttgctctttagaag 420 

MGFLAIVLSVALLFRS 16 

gtatgcacccctctacgtccaattctctgggcactgacaacggcgcagcacatcgggcac 480 

T S G T 20 

cccgttgggcccccggggcaaacatagcgactgcaactcagtcgatcacggctatcaatg 540 

PLGPRGKHSDCNSVDHGYQC 40 

ctttcctgaactctctcataaatggggactctacgcgccctacttctccctccaggacga 600 

FPELSHKWGLYAPYFSLQDE 60

gtctccgtttcctctggacgtcccagaggactgtcacatcaccttcgtgcaggtgctggc 660

SPFPLDVPEDCHITFVQVLA 80

ccgccacggcgcgcggagcccaacccatagcaagaccaaggcgtacgcggcgaccattgc 720

RHGARSPTHSKTKAYAATIA 100

ggccatccagaagagtgccactgcgtttccgggcaaatacgcgttcctgcagtcatataa 780

AIQKSATAFPGKYAFLQSYN 120 

ctactccttggactctgaggagctgactcccttcgggcggaaccagctgcgagatctggg 840

YSLDSEELTPFGRNQLRDLG 140

cgcccagttctacgagcgctacaacgccctcacccgacacatcaaccccttcgtccgcgc 900

AQFYERYNALTRHINPFVRA 160

caccgatgcatcccgcgtccacgaatccgccgagaagttcgtcgagggcttccaaaccgc 960

TDASRVHESAEKFVEGFQTA 180

tcgacaggacgatcatcacgccaatccccaccagccttcgcctcgcgtggacgtggccat 1020

RQDDHHANPHQPSPRVDVAI 200

ccccgaaggcagcgcctacaacaacacgctggagcacagcctctgcaccgccttcgaatc 1080

PEGSAYNNTLEHSLCTAFES 220

cagcaccgtcggcgacgacgcggtcgccaacttcaccgccgtgttcgcgccggcgatcgc 1140

STVGDDAVANFTAVFAPAIA 240

ccagcgcctggaggccgatcttcccggcgtgcagctgtccaccgacgacgtggtcaacct 1200 

QRLEADLPGVQLSTDDVVNL 260 

gatggccatgtgtccgttcgagacggtcagcctgaccgacgacgcgcacacgctgtcgcc 1260

MAMCPFETVSLTDDAHTLSP 280 

gttctgcgacctcttcacggccactgagtggacgcagtacaactacctgctctcgctgga 1320 

FCDLFTATEWTQYNYLLSLD 300 

caagtactacggctacggcgggggcaatccgctgggtccggtgcagggggtcggctgggc 1380 

KYYGYGGGNPLGPVQGVGWA 320 

gaacgagctgatggcgcggctaacgcgcgcccccgtgcacgaccacacctgcgtcaacaa 1440 

NELMARLTRAPVHDHTCVNN 340 

caccctcgacgcgagtccggccaccttcccgctgaacgccaccctctacgccgacttctc 1500

TLDASPATFPLNATLYADFS 360
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Fig. 1/2 

ccacgacagcaacctggtgtcgatcttctgggcgctgggcctgtacaacggcaccgcgcc 1560 
HDSNLVSIFWALGLYNGTAP 380

gctgtcgcagacctccgtcgagagcgtctcccagacggacgggtacgccgccgcctggac 1620
L S Q T S V E S V S Q T D G Y A A A W T 400

ggtgccgttcgccgctcgcgcgtacgtcgagatgatgcagtgtcgcgccgagaaggagcc 1680
VPFAARAYVEMMQCRAEKEP 420

gctggtgcgcgtgctggtcaacgaccgggtcatgccgctgcatggctgccctacggacaa 1740 
L V R V L V N D R V M P L H G C P T D K 440 

gctggggcggtgcaagcgggacgctttcgtcgcggggctgagctttgcgcaggcgggcgg 1800 
LGRCKRDAFVAGLSFAQAGG 460 

gaactgggcggattgtttctgatgttgagaagaaaggtagatagataggtagtacatatg 1860 
N W A D C F 

gattgctcggctctgggtcgttgcccacaatgcatattacgcccgtcaactgccttgcgc 1920 
catccacctctcaccctggacgcaaccgagcggtctaccctgcacacggcttccaccgcg 1980 
acgcgcacggataaggcgcttttgttacggggttggggctgggggcagccggagccggag 2040 
agagagaccagcgtgaaaaacgacagaacatagatatcaattcgacgccaattcatgcag 2100 
agtagtatacagacgaactgaaacaaacacatcacttccctcgctcctctcctgtagaag 2160 
acgctcccaccagccgcttctggcccttattcccgtacgctaggtagaccagtcagccag 2220 
acgcatgcctcacaagaacgggggcgggggacacactccgctcgtacagcacccacgacg 2280 
tgtacaggaaaaccggcagcgccacaatcgtcgagagccatctgcag 
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gtcgacgaggcacaccacgcccgtcctcggcgggtLCcgagagggccgggctcgggttcga 60 

caaggagacgggcgtcccttcgggcgcggctgcgggtgtgggtgttgctgtggacggtga 120 

ggagggggacgggctgggcgttgatgacggtacgaatgcgaacggacacaggccgctgag 180 

cgtgggtgttgcgttctaatctttctttgtgtgggtgtgtacgtgtgggtgtgtatgtgt 240 

ttgggggggggaatgttcttggtaattatctttctacccttcttctctttcctttattct 300 

gttcagcaggtataccccgtgtaagtgtacaggattatgggacgggtgggtggatggact 360 

acttctagaaggacggataaggaaaaaggggaaacacgaatatggcgccctgggtggcgc 420 

gtcgagctggatgcttgacgccggtctggcaaacattttcttcttctagcacccaaccta 480

gtacttgatagagtgtttcggggccaggcggtttgcgctgtgtttttaccaatcaccaac 540 

tagl:gctactactattattgcggctgttgatgcagccgtgt.accaaaaatgccgcggcat 600 

ctccattgatacttgtagttttgatagatcaatatttgggaggttgcgctgggctgctct 660

gaaacccctctctcttgctgtacgtaacgtatgtgcacagtatgtcaccgacaaagacga 720 

ttgcatgcgcatcgttttttgttgtgtttcaggcctcgctcgtgtctagggtataaacac 780 

attgaagactacatatgcgcaagacgttgacattaacggggtcctgcagccgccgcaggt 840 

gcatgtcgtgattaataccacgcgcctgcgtaaattagctagccgccgccctgtttcact 900 

cggttagagacggacaggtgagacgggtctcggttaagcaagcaaattggaatgcaaggt 960

tgaaggtgtaatctgcatagcgtggaaatgagagggctctgtgggcagccaggaaggtga 1020 

gacgaaatgaggaaagaggcaccagaagctgttgttctgaagtgcccgtggtcatagctc 1080 

caggattaagtacggatgtcccatgccaagctgctggcttcgaaagcgagtacggagtag 1140 

tgtccattgttcacgagggatccccaatgtgttagacatgcctgaatcaattttgtccta 1200 

tttttggatttcaactgtttctctcgactgtgctcggtagcgactatgccgcaaggtaca 1260

ctacatgttgtacaataatcatacatcgaccttccgtaggagtgctgaaatacccgacct 1320 

gctctctctagcaggtgcctaatggctttcgtgtaactcgatcgaaacggateagcaagt 1380 

ccatttgctgttggttgagatgtacgatttacaaacacgtggagaggtgagccacagcga 1440

taggcttctggaaggattctggcgtctcggaaagagggccactcgccccactaaccggcg 1500 

ccgatcttgacatggggctcgcagggggtttaagtgcacactacggagtacggattacac 1560

agtagtgtatgggtgggggcgagtttgggtggccttgtgtggggctcaccggctgcctgt 1620 

tctcggggagtcctggcgggccgattggacccacctaaccacgggtagtcttggcccggc 1680 

caactcacaccgccctcatgtttcggagccagtcagggaggcaggcactactcagtcagg 1740

tacacacgtcgggctcctcgatgctgggtgacatcgaggcgatactgcattccaactacg 1800 

gtLtggcataggaggtatcctattctagagctgttctacgccggaacgtaacccgggataa 1860

cccgggatatcgcttccctgagcgagcgcgctgctgaggatcatacaacccaacaaccga 1920 

cgacggtgcaagaaggctgggggaaggaagaaatcaaggaaaaaaaaatagggggggtgg 1980 

ggaccaagagagaaagaaaggagaaaagggtggggggagggaagagaaaaaaaaaacgga 2040

ggaatatggcgtcgctcttcgactggttccggaagggggcatctgggtacacatatgcac 2100 

ctcttccgcacggcagggatataaaccgggagtgcagtcccaccgatcatgctgagtccg 2160

cccgtctccagacttcacggtcgcagaggactagacgcgcggtgaagatgactggcctcg 2220 

M T G L G 5 

gagtgatggtggtgatggtcggcttcctggcgatcgcctctctgtaagcagcgattccag 2280

VMVVMVGFLAIASL 19

gggtccggtgtgcgttaaaagaaaaagctaacgccaccagacaatccgagtcccggccat 2340

QSESRPC 26

gcgacaccccagacttgggcttccagtgtggtacggccatttcccacttctggggccagt 2400
DTPDLGFQCGTAISHFWGQY 46

actcgccctacttctccgtgccctcggagctggatgcttcgatccccgacgactgcgagg 2460
SPYFSVPSELDASIPDDCEV 66

tgacgtttgcccaagtcctctcccgccacggcgcgagggcgccgacgctcaaacgggccg 2520
TFAQVLSRHGARAPTLKRAA 86

cgagctacgtcgatctcatcgacaggatccaccatggcgccatctcctacgggccgggct 2580
SYVDLIDRIHHGAISYGPGY 106

acgagttcctcaggacgtatgactacaccctgggcgccgacgagctcacccggacgggcc 2640
EFLRTYDYTLGADELTRTGQ 126

agcagcagatggtcaactcgggcatcaagttttaccgccgctaccgcgctctcgcccgca 2700
QQMVNSGIKFYRRYRALARK 146
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„!,a,a.,ac.,..caca,.».=,,,aat=,a.^^^^ 

!^:n??i:rtn!r.rc:n:ru"raraca.a=i.....«,tc....c..=.tc.. 35|o 

t;l"tc<:tctgoacat=gg.C953»attgtc,ac 
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Fig. 4 

gaccttggctcgcaaccacacagacacgctgtctccgctctgcgc^^^^ 
;;;;n;;;ag;;t;g^^^ct;tgc;aclgaggcaagacgcgagaaag 
T L A R N H T D T L S P F C A L S T Q E 

ggagtggcaagcatatgactactaccaaagtctggggaa^ 
cctclccgttcgtatactgatgatggtttcagaccccttt 
EWQAYDYYQSLGN 
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Fig. 5 

tacggtagcgcgcaccagcgacgcaagtcagctgtcaccgttctgtcaactc^ 

;;;;;;;;rcgcg;;Scg^;rcSt;;g^;gacagtggcaagacagtt 
TVARTSDASQLS>t^ = 

caatgagtggaagaagtacaactaccttcagtccttgggcaagt^ 
gttactclccttcttcatg^tgatggaagtcaggaacccgttcatg 
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caccatggcgcgcaccgccactcggaaccgtagtctgtctccattttgtgccatcttcac 

1 + + + + + + 60 

gtggtaccgcgcgtggcggtgagccttggcatcagacagaggtaaaacacggtagaagtg
TMARTATRNRSLSPFCAIFT 

tgaaaaggagtggctgcagtacgactaccttcaatctctatcaaagtac 
61 + + + + 109

acttttcctcaccgacgtcatgctgatggaagttagagatagtttcatg
EKEWLQYDYLQSLSKY
Fig. 7/1

atgggcgtctctgctgttctacttcctttgtatctcctagctggagtcacctccggactg 

1 + + + + + + 60

tacccgcagagacgacaagatgaaggaaacatagaggatcgacctcagtggaggcctgac 
MGVSAVLLPLYLLAGVTSGL 

gcagtccccgcctcgagaaatcaatccacttgcgatacggtcgatcaagggtatcaatgc 

61 + + + + + + 120

cgtcaggggcggagctctttagttaggtgaacgctatgccagctagttcccatagttacg 
AVPASRNQSTCDTVDQGYQC 

ttctccgagacttcgcatctttggggtcaatacgcgccgttcttctctctggcaaacgaa 

121 + + + + + + 180

aagaggctctgaagcgtagaaaccccagttatgcgcggcaagaagagagaccgtttgctt

FSETSHLWGQYAPFFSLANE

tcggtcatctcccctgatgtgcccgccggttgcagagtcactttcgctcaggtcctctcc 

181 + + + + + + 240

agccagtagaggggactacacgggcggccaacgtctcagtgaaagcgagtccaggagagg 
SVISPDVPAGCRVTFAQVLS 

cgtcatggagcgcggtatccgaccgagtccaagggcaagaaacactccgctctcattgag 

241 + + + + + + 300

gcagtacctcgcgccataggctggctcaggttcccgttctttatgaggcgagagtaactc 

RHGARYPTESKGKKYSALIE 

gagatccagcagaacgtgaccacctttgatggaaaatatgccttcctgaagacatacaac 

301 + + + + + + 360

ctctaggtcgcctcgcactggtggaaactaccttttatacggaaggacttctgtatgttg 

EIQQNVTTFDGKYAFLKTYN 

tacagcttgggtgcagatgacctgactcccttcggagagcaggagctagtcaactccggc 

361 + + + + + + 420

atgtcgaacccacgtctactggactgagggaagcctctcgtcctcgatcagttgaggccg 
YSLGADDLTPFGEQELVNSG 

atcaagttctaccagcgctacaacgccctcacccgacacatcaaccccttcgtccgcgcc 

421 + + + + + + 480

tagttcaagatggtcgcgatgttgcgggagtgggctgtgtagttggggaagcaggcgcgg 

IKFYQRYNALTRHINPFVRA 

accgatgcatcccgcgtccacgaatccgccgagaagttcgtcgagggcttccaaaccgct

481 + + + + + + 540

tggctacgtagggcgcaggtgcttaggcggctcttcaagcagctcccgaaggtttggcga
TDASRVHESAEKFVEGFQTA 

cgacaggacgatcatcacgccaatccccaccagccttcgcctcgcgtggacgtggccatc 

541 + + + + + + 600

gctgtcctgctagtagtgcggttaggggtggtcggaagcggagcgcacctgcaccggtag
RQDDHHANPHQPSPRVDVAI
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cccgaaggcagcgcctacaacaacacgctggagcacagcctctgcaccgccttcgaatcc 

601 + + + + + + 660

gggcttccgtcgcggacgttgttgtgcgacctcgcgtcggagacgtggcggaagcttagg 

PEGSAYNNTLEHSLCTAFES 

agcaccgtcggcgacgacgcggtcgccaacttcaccgccgtgttcgcgccggcgatcgcc 

661 + + + + + + 720

tcgtggcagccgctgccgcgccagcggttgaagtggcggcacaagcgcggccgctagcgg 

STVGDDAVANFTAVFAPAIA 

cagcgcctggaggccgatcttcccggcgtgcagctgtccaccgacgacgtggtcaacctg 

721 + + + + + + 780

gtcgcggacctccggctagaagggccgcacgtcgacaggtggctgctgcaccagttggac

QRLEADLPGVQLSTDDVVNL

atggccatgtgtccgttcgagacggtcagcctgaccgacgacgcgcacacgctgtcgccg 

781 + + + + + + 840

taccggtacacaggcaagctctgccagccggactggctgctgcgcgtgtgcgacagcggc

MAMCPFETVSLTDDAHTLSP

ttctgcgacctcttcacggccactgagtggacgcagtacaactacctgctctcgctggac 

841 + + + + + + 900

aagacgctggagaagtgccggtgactcacccgcgtcatgttgatggacgagagcgacctg 

FCDLFTATEWTQYNYLLSLD 

aagtactacggctacggcgggggcaatccgctgggtccggtgcagggggtcggctgggcg 

901 + + + + + + 960

ttcatgatgccgatgccgcccccgttaggcgacccaggccacgtcccccagccgacccgc 

KYYGYGGGNPLGPVQGVGWA 
aacgagctgatggcgcggctaacgcgcgcccccgtgcacgaccacacctgcgtcaacaac 

961 + + + + + + 1020

ttgctcgactaccgcgccgattgcgcgcgggggcacgtgctggtgtggacgcagttgttg 

NELMARLTRAPVHDHTCVNN 

accctcgacgcgagtccggccaccttcccgctgaacgccaccctctacgccgacttctcc 

1021 + + + + + + 1080

tgggagctgcgctcaggccggtggaagggcgacttgcggtgggagatgcggctgaagagg 

TLDASPATFPLNATLYADFS 

cacgacagcaacctggtgtcgatcttctgggcgctgggcctgtacaacggcaccgcgccg 

1081 + + + + + + 1140

gtgctgtcgttggaccacagctagaagacccgcgacccggacatgttgccgtggcgcggc

HDSNLVSIFWALGLYNGTAP

ctgtcgcagacctccgtcgagagcgtctcccagacggacgggtacgccgccgcctggacg

1141 + + + + + + 1200

gacagcgtctggaggcagctctcgcagagggtctgcctgcccatgcggcggcggacctgc 

LSQTSVESVSQTDGYAAAWT 
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gtgccgttcgccgctcgcgcgtacgtcgagatgatgcagtgtcgcgccgagaaggagccg 

1201 + + + + + + 1260 

cacggcaagcggcgagcgcgcatgcagctctactacgtcacagcgcggctcttcctcggc 
VPFAARAYVEMMQCRAEKEP 

ctggtgcgcgtgctggtcaacgaccgggtcatgccgctgcatggctgccctacggacaag 

1261 + + + + + + 1320

gaccacgcgcacgaccagttgctggcccagtacggcgacgtaccgacgggatgcctgttc
LVRVLVNDRVMPLHGCPTDK

ctggggcggtgcaagcgggacgctttcgtcgcggggctgagctttgcgcaggcgggcggg 

1321 + + + + + + 1380 

gaccccgccacgttcgccctgcgaaagcagcgccccgactcgaaacgcgtccgcccgccc 
LGRCKRDAFVAGLSFAQAGG 

aactgggcggattgtttctgatgttgagaagaaaggtagatagataggtagtacatatgg 

1381 + + + + + + 1440 

ttgacccgcctaacaaagactacaactcttctttccatctatctatccatcatgtatacc 
N W A D C F *

attgctcggctctgggtcgttgcccacaatgcatattacgcccgtcaactgccttgcgcc 

1441 + + + + + + 1500 

taacgagccgagacccagcaacgggtgttacgtataatgcgggcagttgacggaacgcgg 

atccacctctcaccctggacgcaaccgagcggtctaccctgcacacggcttccaccgcga 

1501 + + + + + + 1560 

taggtggagagtgggacctgcgttggctcgccagatgggacgtgtgccgaaggtggcgct

cgcgcacggataaggcgcttttgttacggggttggggctgggggcagccggagccggaga 

1561 + + + + + + 1620

gcgcgtgcctattccgcgaaaacaatgccccaaccccgacccccgtcggcctcggcctct 

gagagaccagcgtgaaaaacgacagaacatagatatcaattcgacgccaattcatgcaga 

1621 + + + + + + 1680 

ctctctggtcgcactttttgctgtcttgtatctatagttaagctgcggttaagtacgtct 

gtagtatacagacgaactgaaacaaacacatcacttccctcgctcctctcctgtagaaga 

1681 + + + + + + 1740 

catcatatgtctgcttgactttgtttgtgtagtgaagggagcgaggagaggacatcttct 

cgctcccaccagccgcttctggcccttattcccgtacgctaggtagaccagtcagccaga 

1741 + + + + + + 1800

gcgagggtggtcggcgaagaccgggaataagggcatgcgatccatctggtcagtcggtct

cgcatgcctcacaagaacgggggcgggggacacactccgctcgtacagcacccacgacgt 

1801 + + + + + + 1860 

gcgtacggagtgttcttgcccccgccccctgtgtgaggcgagcatgtcgtgggtgctgca 

gtacaggaaaaccggcagcgccacaatcgtcgagagccatctgcaggaattc

1861 + + + + + 1912

catgtccttttggccgtcgcggtgttagcagctctcggtagacgtccttaag 
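
The double-stranded layout of this figure (upper strand, ruler, lower strand, amino acids) can be checked mechanically: the lower strand should be the base-by-base complement of the upper strand, and the upper strand should translate to the printed residues. The sketch below does this for the first 60 bases of the figure. It assumes the standard genetic code, and the function names `complement` and `translate` are hypothetical.

```python
from itertools import product

# Standard genetic code, built in the conventional t/c/a/g codon order.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def complement(seq: str) -> str:
    """Base-by-base complement (not reversed), as the lower strand is printed."""
    return seq.translate(str.maketrans("acgt", "tgca"))

def translate(seq: str) -> str:
    """Translate complete codons starting at the first base of seq."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))

# Positions 1 to 60 of the figure, upper and lower strand as printed.
upper = "atgggcgtctctgctgttctacttcctttgtatctcctagctggagtcacctccggactg"
lower = "tacccgcagagacgacaagatgaaggaaacatagaggatcgacctcagtggaggcctgac"

print(complement(upper) == lower)  # True
print(translate(upper))            # MGVSAVLLPLYLLAGVTSGL
```

A mismatch between `complement(upper)` and the printed lower strand at a single position is a useful hint that one of the two strands contains a transcription error at that position.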






EP 0 684 313 A2 




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1271 



aterr21 1 gacggtcagcctgaccgacgacgcgcacacgctgtcgccgttctgcgacc 50
aterr21 51 tcttcaccgccgccgagtggacgcagtacaactacctgctctcgctggac 100



9al 1322 aagtactacggc 1333 

I I I I I I I I I i I 
aterr21 101 aagtactacgtc 112 



6 

9al 1507 



caacaacctggtgtcgatcttctgggcgctgggcctgtacaacggcaccg 

^1? II Ml mi 1 11111111111:1111111 11 1 1 I I II I 1 Mill 
aterrsa 1 iigtiiiiiggtgtcgatctrctggxcgctgggtctgtacaacggcacca 

9al 1557 cgccgctgrcgcagacctccgtcgagagcgtctcccagacg 1597 

III I I II I I 1 1 I 1 I I I I I I I I I " ' " 'i!' 91 
aterrsa 51 igcccccgtcgcagaccaccgtggaggacatcacccggacg 91 
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